Issue saving across VPN
#11
And <BOOM> goes the dynamite!  Yep - that worked like a charm.  Soooo fast now.  Seat of the pants - it feels about as fast as copying the file directly.

Thank you for the fix, thank you for the quick turnaround, and thank you for your dedication to this project.  Well, you and all of the others that make this a success and keep it going.  You guys all rock !!!

dano
Reply
#12
(11-07-2023, 06:29 PM)dano Wrote: And <BOOM> goes the dynamite!  Yep - that worked like a charm.  Soooo fast now.  Seat of the pants - it feels about as fast as copying the file directly.

Thank you for the fix, thank you for the quick turnaround, and thank you for your dedication to this project.  Well, you and all of the others that make this a success and keep it going.  You guys all rock !!!

dano
Wow, that was a fast solution. Yes, they rock!
Reply
#13
Quality of life fixes are always the best ones. Just little things that make something a tiny bit faster or easier. Love it. Binary is the way to go.
Tread on those who tread on you

Reply
#14
(11-07-2023, 07:37 PM)SpriggsySpriggs Wrote: Quality of life fixes are always the best ones. Just little things that make something a tiny bit faster or easier. Love it. Binary is the way to go.

I doubt it's the change over to BINARY which causes the speed increase -- after all, behind the scenes, *all* file writing methods end up translating to the same C source routines.  The difference here is from writing the data one line at a time, which the VPN then does data/virus checking on one line at a time, versus writing all the data at once and only having a single run of the data/virus checker.

I'd imagine that there's probably a system setting somewhere that one could play around with and tweak to whitelist the VPN directory to stop virus checking, and another setting elsewhere to turn off data transfer verification, but those aren't anything which we could actually change for a user.  The switch from multiple writes to a single write is easy enough, and it seems to solve the issue here, so I've already pushed the change into the repo for the other devs to look over and review.  If it all passes muster, then the next version of QB64PE should start doing file writes all at once for us, rather than line-by-line as it does currently.
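
For anyone following along, here is a minimal sketch of the "join everything, then write once" idea described above. It is not the actual change from the repo; srcLine$(), lineCount, buffer$ and the file name "output.bas" are placeholder names made up for this example.

```basic
' Minimal sketch: join every line in memory, then write the file with one PUT.
' srcLine$(), lineCount, buffer$ and "output.bas" are placeholders for this example only.
DIM srcLine$(1 TO 3)
srcLine$(1) = "PRINT " + CHR$(34) + "Hello" + CHR$(34)
srcLine$(2) = "SLEEP"
srcLine$(3) = "END"
lineCount = 3
crlf$ = CHR$(13) + CHR$(10)

' Build the whole file as a single string
buffer$ = ""
FOR i = 1 TO lineCount
    buffer$ = buffer$ + srcLine$(i) + crlf$
NEXT

' BINARY does not truncate an existing file, so remove any old copy first
IF _FILEEXISTS("output.bas") THEN KILL "output.bas"
OPEN "output.bas" FOR BINARY AS #1
PUT #1, , buffer$ ' one write for the whole file
CLOSE #1
```

The only part that matters is that the drive sees a single PUT instead of one write per line; everything else is scaffolding so the snippet runs on its own.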
Reply
#15
Right. Writing all of it at one time rather than one line at a time. That would be an obvious speed increase even if they are the same on the backend. Ordering 50 cables on Amazon 1 time is faster than ordering 1 cable 50 times.

Reply
#16
https://github.com/QB64-Phoenix-Edition/QB64pe/pull/407 -- Changes pushed into the repo.  Future versions, when we release them, will have this fix patched into them so it shouldn't be an issue for you, from here on out.
Reply
#17
(11-07-2023, 07:45 PM)SMcNeill Wrote:
(11-07-2023, 07:37 PM)SpriggsySpriggs Wrote: Quality of life fixes are always the best ones. Just little things that make something a tiny bit faster or easier. Love it. Binary is the way to go.

I doubt it's the change over to BINARY which causes the speed increase -- after all, behind the scenes, *all* file writing methods end up translating to the same C source routines.  The difference here is from writing the data one line at a time, which the VPN then does data/virus checking on one line at a time, versus writing all the data at once and only having a single run of the data/virus checker.

I'd imagine that there's probably a system setting somewhere that one could play around with and tweak to whitelist the VPN directory to stop virus checking, and another setting elsewhere to turn off data transfer verification, but those aren't anything which we could actually change for a user.  The switch from multiple writes to a single write is easy enough, and it seems to solve the issue here, so I've already pushed the change into the repo for the other devs to look over and review.  If it all passes muster, then the next version of QB64PE should start doing file writes all at once for us, rather than line-by-line as it does currently.
Bear in mind that I tried this with the antivirus disabled and it still had the same result.

Because of the way an IPsec VPN works, there is a penalty for transferring the same data in smaller chunks vs 1 single large chunk.  There is overhead for each transfer, so every additional transfer adds time that the single transfer does not have.  I believe this is where (at least part of) the difference in transfer time comes from.
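
To put that per-transfer overhead into rough numbers, here is a back-of-the-envelope model. Every value in it is a made-up placeholder rather than a measurement from this network, and the variable names are only for illustration; the point is just that the fixed cost is paid once per write.

```basic
' Rough cost model only: every number below is an assumed placeholder, not a measurement.
perWriteOverhead# = 0.05 ' assumed seconds of fixed overhead per write over the VPN
bytesPerSecond# = 1000000 ' assumed sustained throughput of the link
totalBytes# = 200000 ' assumed size of the file being saved
lineCount = 4000 ' assumed number of lines in the file

' Time spent moving the data itself is the same either way
transfer# = totalBytes# / bytesPerSecond#

' The fixed overhead is paid once per write, so it scales with the number of writes
PRINT USING "####.### seconds as one write"; 1 * perWriteOverhead# + transfer#
PRINT USING "####.### seconds line by line"; lineCount * perWriteOverhead# + transfer#
```

With any overhead that is not tiny, the line-by-line total is dominated by the number of writes rather than by the amount of data.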

It works now and thank you.  Funny story, years ago this issue caused a critical file to become deleted (was not backed up...that is another story) and although the rewrite took 6 months, it brought about changes that ultimately were needed in another project and in the end made a very lucrative contract possible!  Sometimes there is a diamond to be found in a turd!
Reply
#18
@dano Could you give the following code a test run and see how it works on your network device?   Just change the filenames to your network drive (don't forget to clean them up afterwards), and let me know what the time results are for you.

Note:  This may take a while, if a 4000 line program takes 5+ minutes for you.  You might want to let this run while taking a break for lunch, or sleeping overnight.

Code:
crlf$ = CHR$(13) + CHR$(10)

OPEN "temp.txt" FOR OUTPUT AS #1
OPEN "temp2.txt" FOR BINARY AS #2
OPEN "temp3.txt" FOR BINARY AS #3
OPEN "temp4.txt" FOR BINARY AS #4

a$ = "This is just one long line of junk to test some basic write times."
FOR i = 1 TO 10
    a$ = a$ + a$
NEXT

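PRINT LEN(a$) ' 66 characters doubled 10 times = 67584
END ' NOTE: remove this END to run the timing tests below; as written, the program stops here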

t# = TIMER
FOR j = 1 TO 10000
    PRINT #1, a$
NEXT
t1# = TIMER

FOR j = 1 TO 10000
    PUT #2, , a$
    PUT #2, , crlf$
NEXT
t2# = TIMER


' Only 1000 lines here (1/10th of the other tests), since building one huge string is slow
FOR j = 1 TO 1000
    b$ = b$ + a$ + crlf$
NEXT
PUT #3, , b$
t3# = TIMER

FOR i = 1 TO 1000
    c$ = ""
    FOR j = 1 TO 10
        c$ = c$ + a$ + crlf$
    NEXT
    PUT #4, , c$
NEXT
t4# = TIMER



CLOSE
PRINT USING "#####.##### seconds with OUTPUT"; t1# - t#
PRINT USING "#####.##### seconds with BINARY"; t2# - t1#
PRINT USING "#####.##### seconds with BINARY (join/1 write) (estimated as the loops were 1/10th the size)"; (t3# - t2#) * 100
PRINT USING "#####.##### seconds with BINARY (batch join/write)"; t4# - t3#
Reply
#19
Absolutely, happy to.  I see that line 7 [End] was there to see if I was paying attention, LOL !!!  I was not!

Below are the results:

1087.41797 seconds with OUTPUT
404.17578 seconds with BINARY
12329.68780 seconds with BINARY (join/1 write) (estimated as the loops were 1/10 the size)
255.98828 seconds with BINARY (batch join/write)
Reply
#20
Those results are more or less what I expected.   Let me break them down for you, and I'll explain what's going on with the values -- and why you might want to tweak your personal version of QB64 to suit your own specific needs.

About 1000 seconds for OUTPUT and PRINT.   This is just a basic loop which uses PRINT to print the data line by line.  It's our baseline time, and what QB64 uses currently.

About 400 seconds for PUT and BINARY.  This uses PUT to place the data line by line, more or less just like the above -- but it's over twice as fast.

About 12,300 seconds (estimated) for PUT with one single write.   This is basically what I showed you above, that you said zoomed so speedily on your drive.  As long as your programs are short, and there's little data to place into a single line, this is the method which will give you the best write speed.   UNFORTUNATELY, as you can see from this test, when the files start getting longer and the program line count grows, the times here bloat far faster than the line count does.  String manipulation is a convoluted and slow process, and even though this method has the fastest WRITE times, it has a very, very long BUILD time where it assembles all those various lines into one massive line to write all at once.

About 255 seconds for PUT with writing in batches.  This is a middle ground method where we assemble a small number of lines together and then PUT them to disk in batches.  It's probably the optimal method for your use on your network, if you have files which grow to any sort of size whatsoever.



As long as your programs are short, you'll probably find that the method I shared earlier is going to be the best for you.  These lines are massive at about 67,000 characters per line, and we end up writing over 60MB into the save file even in the 1,000-line test (and about ten times that in the 10,000-line tests) -- much more than what most folks will ever have for a BAS file!!

But, there's that trade off that comes with the string concatenation with large strings.   Once the program gets past X number of lines, any speed you save from that single write is going to be lost due to the massive overhead that comes from assembling all those lines into one.
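
A rough way to see where that overhead comes from is sketched below. It rests on one assumption that this thread does not confirm: that every b$ = b$ + a$ copies the whole string built so far. Under that assumption, the amount of data shuffled around grows with the square of the line count.

```basic
' Assumption for this sketch: each b$ = b$ + a$ copies the whole string built so far.
lineLen# = 67586 ' the 67584-character test line plus CR+LF, as in the test program
FOR n# = 1000 TO 10000 STEP 9000
    ' Sum of lineLen * (1 + 2 + ... + n) bytes copied over n appends
    copied# = lineLen# * n# * (n# + 1) / 2
    PRINT USING "##### lines: about ####### MB copied while building"; n#; copied# / 1000000
NEXT
```

By this estimate, 1,000 of the test lines means roughly 34 GB of copying and 10,000 means roughly 3.4 TB, so ten times the lines costs about a hundred times the build work.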

For QB64PE itself, we're probably going to end up going with the simple PUT line-by-line method for everyone.  It means that save times will be reduced by more than half for everyone -- including your network saves.  Instead of a 9-minute save, I'd imagine the files would end up taking about 3 or 4.

You'll just have to decide for yourself what works best in your specific case.  For us, we can't ever be certain how long a user's program might be, so the "add it all into one string and do it in a single write" method isn't viable as a general use method for the masses.   With massively long BAS files, the times involved with it are about a dozen times LONGER than what you saw originally, and if you found a 9 minute save unbearable, think of how a nearly two hour save would be!!

By delving deeper into this, we've shown we can improve save speeds by half, without having the issue of slowing things down inadvertently with much longer source files.  That's not a bad accomplishment to tuck under the belt at the end of the day!

In your specific case however, you need to weigh the following factors:

1) Is file size an issue?
2) How great is the network delay?
3) Which method gives the best results, given the size of the files you normally work with, the time it takes to assemble them into a single string, and the time it takes to write them to the disk?

Small files?   Write them all at once.   The overhead is small in adding the lines together and the time saved in the write is worth it.
Very large files?   Probably best to write them in batches of 10 to 25 lines in a single pass.  Don't let the string you're writing grow overly long, but reduce the number of writes to the drive as much as possible.

For general use -- just PUT them line by line and improve speeds considerably.
For your specific network use?  It all depends on the sizes involved and the optimal number of merges with their overhead attached, compared to the overhead for the file writing itself.
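
If you want to experiment with the batch approach, something along these lines might be a starting point. It is only a sketch: SaveInBatches, srcLine$() and the batch size of 20 are illustrative names and values made up for this example, not anything that exists in QB64PE.

```basic
' Sketch of a batched save. SaveInBatches, srcLine$() and the batch size of 20
' are illustrative names and values, not part of QB64PE.
DIM SHARED srcLine$(1 TO 100)
FOR i = 1 TO 100
    srcLine$(i) = "' line" + STR$(i)
NEXT
SaveInBatches "batched.bas", 100, 20
END

SUB SaveInBatches (file$, lineCount, batchSize)
    crlf$ = CHR$(13) + CHR$(10)
    IF _FILEEXISTS(file$) THEN KILL file$ ' BINARY would not truncate an old copy
    f = FREEFILE
    OPEN file$ FOR BINARY AS #f
    batch$ = ""
    FOR i = 1 TO lineCount
        batch$ = batch$ + srcLine$(i) + crlf$
        ' Flush after every batchSize lines, and once more for any leftover lines
        IF i MOD batchSize = 0 OR i = lineCount THEN
            PUT #f, , batch$ ' one write per batch
            batch$ = ""
        END IF
    NEXT
    CLOSE #f
END SUB
```

Raising the batch size cuts the number of writes but makes each joined string longer, so timing a few values on the actual network drive is the only real way to pick one.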
Reply



