Performance improvement splitting huge file
#2
(10-21-2023, 01:36 PM)mdijkens Wrote: This is a simplified part of a more complex process to split 1 huge inFile into multiple smaller outFile() ones:

Code:
  Dim As _Unsigned _Byte splitFiles, splitFile
  Dim As _Unsigned _Integer64 inSize, inPos, splitPos, splitSize ' inPos must count past 255, so it cannot be a byte
  Dim As _Unsigned _Byte inFile(1 To inSize), char
  Get #1, 1, inFile()
  Dim As String outFile(splitFiles)
  For splitFile = 1 To splitFiles
    outFile(splitFile) = String$(splitSize, 0)
  Next splitFile

  For splitPos = 1 To splitSize
    For splitFile = 1 To splitFiles
      inPos = inPos + 1
      If inPos <= inSize Then
        char = inFile(inPos)
        Mid$(outFile(splitFile), splitPos, 1) = Chr$(char)
      End If
    Next splitFile
  Next splitPos
  For splitFile = 1 To splitFiles
    Put #splitFile, , outFile(splitFile)
  Next splitFile
inFile is the byte array holding the input file
inSize is the size in bytes of inFile
inPos is the current character position in inFile
outFile() are the strings built for the split files
splitFiles is the number of files to split into
splitSize is the size in bytes of each outFile (e.g. roundup(inSize/splitFiles))
splitFile is the current split file
splitPos is the current character position in the outFile

The above works, but variable-length strings and the Mid$() statement are very slow (a 2 GB inFile takes more than 10 minutes).
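
One low-effort change that may help here is QB64's ASC statement, which overwrites a single byte of an existing string in place and avoids building a temporary one-character string with Chr$() on every iteration. The sketch below is only a small self-contained demo of the same round-robin loop: the ten-byte source string and the constant of 3 split files are made-up test values, not taken from the original program, and no timing has been measured.

```qb64
' Demo of the round-robin split using the ASC statement instead of Mid$/Chr$.
' source and the value 3 are assumed test values for illustration only.
Const splitFiles = 3
Dim As Long inSize, inPos, splitPos, splitSize, splitFile
Dim source As String

source = "ABCDEFGHIJ"
inSize = Len(source)
splitSize = (inSize + splitFiles - 1) \ splitFiles ' round up: 4 bytes per output

Dim outFile(1 To splitFiles) As String
For splitFile = 1 To splitFiles
    outFile(splitFile) = String$(splitSize, 0) ' pre-size once, padded with zero bytes
Next splitFile

For splitPos = 1 To splitSize
    For splitFile = 1 To splitFiles
        inPos = inPos + 1
        If inPos <= inSize Then
            ' ASC as a statement writes one byte in place; no Chr$ temporary is created
            Asc(outFile(splitFile), splitPos) = Asc(source, inPos)
        End If
    Next splitFile
Next splitPos

' outFile(1) now starts with "ADGJ"; outFile(2) and outFile(3) end in one zero byte
For splitFile = 1 To splitFiles
    Print splitFile; Len(outFile(splitFile)); Left$(outFile(splitFile), 3)
Next splitFile
```

In the original loop the equivalent line would be Asc(outFile(splitFile), splitPos) = inFile(inPos), since inFile() already holds byte values.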

I've tried 2-dimensional byte arrays for the out-files, like outFile(files, length), but QB64's Put cannot write a single row of a 2-dimensional array, as in Put #x, , outFile(x).
I've also tried mapping this 2-dimensional array with _MEM, but have not succeeded so far.

Does anyone have a clever trick to speed this up?

Binary access: not clever, but then no splitting is needed, just clever coding ;-)) You don't say why you think you need to split.
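
If the split files are needed after all, one possible reading of "binary access" is to stream the input in fixed-size chunks instead of loading all of it into memory first. The following is a rough sketch under assumptions, not a tested or benchmarked solution: the file names "huge.dat" and "partN.dat", the CHUNK size, and the count of 4 split files are all hypothetical. Unlike the code in the quoted post, it does not pad the last bytes of each output with zeros.

```qb64
' Sketch: stream the input in chunks and de-interleave each chunk round-robin.
' "huge.dat", "partN.dat", CHUNK and splitFiles = 4 are assumptions for illustration.
Const CHUNK = 1048576 ' bytes read per Get
Dim As Long splitFiles, f, h, i, n
Dim As _Integer64 remaining
Dim As String buffer, piece, outName

splitFiles = 4
Dim part(1 To splitFiles) As String ' per-file buffer for the current chunk
Dim used(1 To splitFiles) As Long ' bytes filled in each buffer

Open "huge.dat" For Binary As #1
remaining = LOF(1)

For f = 1 To splitFiles
    outName = "part" + LTrim$(Str$(f)) + ".dat"
    If _FileExists(outName) Then Kill outName ' Binary mode does not truncate old data
    h = f + 1 ' handle 1 is the input, outputs use 2 and up
    Open outName For Binary As #h
Next f

f = 0 ' round-robin position, carried across chunks
Do While remaining > 0
    If remaining < CHUNK Then n = remaining Else n = CHUNK
    buffer = Space$(n)
    Get #1, , buffer ' reads Len(buffer) bytes in one call
    For i = 1 To splitFiles
        part(i) = Space$(n \ splitFiles + 1) ' upper bound on bytes per file
        used(i) = 0
    Next i
    For i = 1 To n
        f = f Mod splitFiles + 1 ' 1, 2, ..., splitFiles, 1, ...
        used(f) = used(f) + 1
        Asc(part(f), used(f)) = Asc(buffer, i)
    Next i
    For i = 1 To splitFiles
        If used(i) > 0 Then
            piece = Left$(part(i), used(i)) ' Put needs a variable, not an expression
            h = i + 1
            Put #h, , piece
        End If
    Next i
    remaining = remaining - n
Loop
Close
```

Memory use stays around two chunks regardless of input size, so CHUNK can be raised freely if larger reads turn out to be faster on a given disk.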


Messages In This Thread
RE: Performance improvement splitting huge file - by bplus - 10-21-2023, 01:40 PM