Did I make the right choice with C#? (The speed gap between C# and Delphi is roughly 8x)

winston · Posted 2012-1-13 14:59:26

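For a project, I need to scan a text file of 12 million lines. In testing, the gap between C# and Delphi turned out to be rather extreme. Without further ado, here is the test code.
Here is the Delphi code: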

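// Scan the file for line breaks and record the start offset of every line
function ScanEnterFile(const FileName:string):TInt64Array;
var
  MyFile:TMemoryStream;  // in-memory copy of the file
  rArray:TInt64Array;    // result: index of line-start offsets
  size,curIndex:int64;   // file size, current position in the buffer
  enterCount:int64;      // number of line breaks found
  DoLoop:Boolean;        // whether to keep looping
  pc: PAnsiChar;         // byte-wise pointer (PChar is 2 bytes wide in Unicode Delphi)
  arrayCount:int64;      // current capacity of the index array
  addStep:integer;       // offset to add when a line-break character is found
begin
  Result:=nil;
  if fileName = '' then
    Exit;
  if not FileExists(fileName) then
    Exit;
  MyFile:=TMemoryStream.Create;
  try
    MyFile.LoadFromFile(fileName); // load the whole file into memory
    size:=MyFile.Size;
    pc:=MyFile.Memory;             // point the byte pointer at the stream's buffer
    curIndex:=RowLeast;
    setlength(rArray,perArray);
    arrayCount:=perArray;
    enterCount:=0;
    rArray[enterCount]:=0;         // the first line starts at offset 0
    DoLoop:=curIndex<size;
    while DoLoop do
    begin
      addStep:=0;
      if (ord(pc[curIndex])=13) then
        addStep:=2;                // on CR: the next line starts after CR+LF
      if (ord(pc[curIndex])=10) then
        addStep:=1;                // on LF: the next line starts right after it
      // a line break was found
      if (addStep<>0) then
      begin
        Application.ProcessMessages;
        // record one more line
        inc(enterCount);
        // grow the array when it is full
        if (enterCount mod perArray=0) then
        begin
          arrayCount:=arrayCount+perArray;
          setlength(rArray,arrayCount);
        end;
        rArray[enterCount]:=curIndex+addStep;
        // skip the minimum line length before looking for the next break
        curIndex:=curIndex+addStep+RowLeast;
      end
      else
        curIndex:=curIndex+2;      // a CRLF pair is 2 bytes, so a stride of 2 cannot miss it
      // stop before reading past the end of the buffer
      DoLoop:=curIndex<size;
    end;
    result:=rArray;
  finally
    freeandnil(MyFile);
  end;
end;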

Calling code:

procedure TMainForm.btn2Click(Sender: TObject);
var
  datasIndex:TInt64Array; // line index of the data file
  t1:Cardinal;            // start time in milliseconds
begin
  t1:=GetTickCount;
  datasIndex:=ScanEnterFile('R:\201201_dataFile.txt');
  Caption:=Caption+'::'+inttostr(GetTickCount-t1);
end;

Result: 16782 ms

Here is the C# code:

      /// <summary>
      /// Scans a text file, counts its lines, and returns the start offset of each line
      /// (on 12 million rows this is about 10 seconds faster than the array-based version).
      /// </summary>
      /// <param name="fileName">File name</param>
      /// <param name="rowCount">Number of lines</param>
      /// <param name="rowLeast">Minimum length of a line</param>
      /// <param name="progress">Progress callback, called with the running line-break count</param>
      /// <returns>List of line-start offsets</returns>
      public static IList<long> ScanEnterFile(string fileName, out int rowCount, int rowLeast,ThreadProgress progress)
      {
         rowCount = 0;
         if (string.IsNullOrEmpty(fileName))
            return null;
         if (!System.IO.File.Exists(fileName))
            return null;
         // Open the file as a stream with an 8-byte buffer; the file is not loaded into memory.
         using (FileStream myFile = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read, 8))
         {
            IList<long> rList = new List<long>();
            int enterCount = 0; // number of line breaks found
            int checkValue;
            int addStep;
            myFile.Position = rowLeast;
            checkValue = myFile.ReadByte();
            while (checkValue != -1)
            {
               //Application.DoEvents();
               addStep = -1;
               // After ReadByte, the stream position has already advanced by one byte.
               // So if this is the first byte of the line break (CR), skip one more byte;
               // if it is the second byte (LF), no extra skip is needed.
               if (checkValue == 13)
                  addStep = 1;
               else if (checkValue == 10)
                  addStep = 0;
               if (addStep >= 0)
               {
                  enterCount++;
                  rList.Add(myFile.Position + addStep);
                  myFile.Seek(rowLeast + addStep, SeekOrigin.Current);
                  progress(enterCount);
               }
               // ReadByte already moved one byte, so this advances 3 bytes per step
               // (the Delphi loop advances 2).
               else myFile.Seek(2, SeekOrigin.Current);
               checkValue = myFile.ReadByte();
            }
            rowCount = enterCount + 1;
            return rList;
         }
      }
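
The position arithmetic described in the loop comments can be checked in isolation. Below is a minimal sketch, not part of the measured code; `PositionDemo` and `NextLineStart` are hypothetical names made up for illustration. It only shows that, after `ReadByte`, `Position` already points one past the byte just read, which is why CR needs one extra byte of offset and LF needs none.

```csharp
using System;
using System.IO;
using System.Text;

public static class PositionDemo
{
    // Hypothetical helper: returns the offset of the line that follows
    // the first line break in the stream, or -1 if there is none.
    public static long NextLineStart(Stream s)
    {
        int b;
        while ((b = s.ReadByte()) != -1)
        {
            // s.Position now points one past the byte that was just read.
            if (b == 13) return s.Position + 1; // CR: also skip the LF that follows
            if (b == 10) return s.Position;     // LF: already past the line break
        }
        return -1;
    }
}
```

For example, with the bytes of "ab\r\ncd" in a `MemoryStream`, the CR sits at offset 2, `Position` is 3 after reading it, and the next line starts at offset 4.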

Calling code:

         Stopwatch stopwatch = new Stopwatch();
         stopwatch.Start();
         int rowCount;
         FileHelper.ScanEnterFile(@"R:\201201_dataFile.txt", out rowCount, 35, outputProgress);
         useTime = stopwatch.ElapsedMilliseconds;

Result:
124925 ms
In my tests, the C# version using IList<T> was about 10 seconds faster than the one using an array.
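
Note that the two versions do not do the same I/O: the Delphi code loads the whole file into a TMemoryStream and indexes memory, while the C# code calls ReadByte and Seek on a FileStream for every step. For a closer like-for-like test, the same scan can be run over a byte array. This is only a sketch under assumptions: `InMemoryLineScanner` and `ScanLineStarts` are hypothetical names, the file is assumed to fit in a single array (under 2 GB), and I have not timed it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class InMemoryLineScanner
{
    // Hypothetical helper mirroring the Delphi loop: returns the start
    // offset of every line, beginning with offset 0 for the first line.
    public static List<long> ScanLineStarts(byte[] data, int rowLeast)
    {
        var starts = new List<long> { 0 };
        long i = rowLeast;
        while (i < data.Length)
        {
            int addStep = 0;
            if (data[i] == 13) addStep = 2;      // CR: next line starts after CR+LF
            else if (data[i] == 10) addStep = 1; // LF: next line starts right after it
            if (addStep != 0)
            {
                starts.Add(i + addStep);
                i += addStep + rowLeast;         // skip the minimum line length
            }
            else
            {
                i += 2;                          // stride of 2 cannot miss a CRLF pair
            }
        }
        return starts;
    }
}
```

It could be called as `InMemoryLineScanner.ScanLineStarts(File.ReadAllBytes(path), 35)`, where `path` stands for the data file used above.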
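Summary: everything has its own value. As for what you readers choose, pick according to your own needs; I am not leaning toward either side here. In the end, as long as it gets the job done, nothing else matters.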