Huge array of variable length strings - Printable Version
QB64 Phoenix Edition (https://qb64phoenix.com/forum), General Discussion
Thread: Huge array of variable length strings (/showthread.php?tid=3130)

Huge array of variable length strings - mdijkens - 10-17-2024

For a project I need to store an array of variable length strings. Let's say:
Code: (Select All)
But the issue is that the string lengths could vary from several bytes up to 2 GB.
Code: (Select All)
As soon as the array's total size goes above a couple of GB, the program aborts... I'd like to find a way to make maximum use of internal memory (>=32 GB). What would be the best approach to define this? I think _Mem is not very suitable for variable length strings. I could do one big _Mem and keep track of indexes/blocks, but that complicates the code quite a bit. Any better suggestions?

RE: Huge array of variable length strings - ahenry3068 - 10-17-2024

I have some ideas, but it depends on the application. Questions: What do these strings represent (files, a text buffer in an editor, etc.)? Why do you want to load them all at once? I'm thinking of a more 'C'-like approach where your array is actually an array of pointers. Then you write a couple of SUBs to allocate and deallocate a _MEM for each pointer, plus a cleanup SUB to free all the _MEMs.

RE: Huge array of variable length strings - mdijkens - 10-17-2024

(10-17-2024, 11:11 AM)ahenry3068 Wrote: I have some ideas but it depends on the application.

I'm reading the contents of a directory of files in order to run a lot of searches on that content and report back which files have matches. The search terms are not known up front but depend on the content/dependencies of these files, so I can't do the searches file by file...

I am also thinking of an array of pointers into one big _Mem that I load all contents into, but I'm also curious which 'normal' variable structures can hold the biggest set of variable length strings. Are there maximum size differences between variable/fixed length, arrays, shared/non-shared, dynamic/static, user defined types, etc.?

RE: Huge array of variable length strings - luke - 10-17-2024

By all rights this should work fine as long as you have a reasonable amount of memory (which, as you say, you do). I can consistently reproduce the crash after 32 loop iterations, and at a glance in the debugger this looks like a QB64 bug.

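
The "one big block plus an index" idea discussed above can be sketched roughly as follows. This is only an illustration, written in C because QB64 translates to C++ and the suggestion itself is C-like; it is not code from any poster. The names `Pool`, `Entry`, `pool_add`, `pool_get` and `pool_free` are hypothetical. In QB64 the same shape would presumably be one `_MEM` block plus two integer arrays for start offsets and lengths.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: one contiguous byte block plus a per-string index. */
typedef struct { size_t off, len; } Entry;

typedef struct {
    char  *data;       /* every string's bytes, stored back to back */
    size_t used, cap;  /* size_t, so no 32-bit ceiling on a 64-bit build */
    Entry *entry;      /* entry[i] = start offset and length of string i */
    size_t count, entry_cap;
} Pool;

/* Append n bytes as a new string. Returns 0 on success, -1 if out of memory. */
int pool_add(Pool *p, const void *bytes, size_t n) {
    if (n > p->cap - p->used) {                 /* grow the byte block */
        size_t cap = p->cap ? p->cap : 4096;
        while (cap - p->used < n) cap *= 2;     /* overflow check omitted in this sketch */
        char *d = realloc(p->data, cap);
        if (!d) return -1;
        p->data = d;
        p->cap = cap;
    }
    if (p->count == p->entry_cap) {             /* grow the index */
        size_t ec = p->entry_cap ? p->entry_cap * 2 : 16;
        Entry *e = realloc(p->entry, ec * sizeof *e);
        if (!e) return -1;
        p->entry = e;
        p->entry_cap = ec;
    }
    if (n) memcpy(p->data + p->used, bytes, n);
    p->entry[p->count].off = p->used;
    p->entry[p->count].len = n;
    p->count++;
    p->used += n;
    return 0;
}

/* Pointer to the first byte of string i (not NUL-terminated; use entry[i].len). */
const char *pool_get(const Pool *p, size_t i) {
    return p->data + p->entry[i].off;
}

/* Release everything and reset the pool to its empty state. */
void pool_free(Pool *p) {
    free(p->data);
    free(p->entry);
    memset(p, 0, sizeof *p);
}
```

Starting from `Pool p = {0};`, each file's contents would be appended with `pool_add(&p, bytes, n)` and read back through `pool_get(&p, i)` together with `p.entry[i].len`. The index stores offsets rather than pointers because growing the block may move it, which would invalidate any saved pointers.
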
RE: Huge array of variable length strings - SMcNeill - 10-17-2024

If I remember right, there's some internal logic that bugs out at around the same limit as a LONG variable type (2 GB of memory usage, or so). The only times I've successfully used larger batches of memory like this, it has always been via a _MEM structure.

RE: Huge array of variable length strings - ahenry3068 - 10-17-2024

(10-17-2024, 11:42 AM)luke Wrote: By all rights this should work fine as long as you have a reasonable amount of memory (which, as you say, you do). I can consistently reproduce the crash after 32 loop iterations, and at a glance in the debugger this looks like a QB64 bug.

The most likely cause of such a bug is someone using a signed 32-bit integer in the code when they should be using an unsigned 32-bit integer or a 64-bit integer. The clue is that it happens at 2 GB, which is the limit of a signed 32-bit integer. I ran into a bug in the _SND subsystem that hits the same limitation: _SNDSETPOS will fail on wave files larger than 2 GB.

RE: Huge array of variable length strings - luke - 10-17-2024

Yep. The size of the string allocation area (i.e. all current string allocations) is tracked in an unsigned 32 bit value. I'll see about changing that to a size_t or similar.

RE: Huge array of variable length strings - mdijkens - 10-17-2024

As a test, I created the following, which works. (Of course _ReadFile$() only works for files up to 2 GB, but I already have a file reader function with no limit, so for testing it's okay.)
Code: (Select All)
What would now be the fastest way to text-search a _Mem? There's no _MemSearch or anything like it...

RE: Huge array of variable length strings - mdijkens - 10-17-2024

Hmmm, this one also aborts above 3.5 GB of files... It's the _ReadFile$() that goes wrong after 10000+ files with a combined size >3 GB.
Code: (Select All)
I think that's a bug!

RE: Huge array of variable length strings - mdijkens - 10-17-2024

I think it is the c$ assignment where memory gets corrupted:
Code: (Select All)
The second assignment aborts the program... It seems that once usage goes above 1 GB, it sooner or later aborts.
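
The earlier question about text-searching a raw memory block went unanswered in the thread. One possible approach, sketched here in C purely as an illustration, is to skip ahead to each occurrence of the search term's first byte with `memchr` and only then compare the full term with `memcmp`. The function name `buf_find` and its parameters are hypothetical. Whether this is the fastest option is not claimed, and a QB64 version would need its own byte-level access to the `_MEM` block.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of needle in
   hay[0..hay_len) at or after `from`, or (size_t)-1 if there is none.
   Works on raw bytes, so embedded NUL bytes are fine. */
size_t buf_find(const char *hay, size_t hay_len,
                const char *needle, size_t needle_len, size_t from) {
    if (needle_len == 0 || from > hay_len || needle_len > hay_len - from)
        return (size_t)-1;

    size_t last = hay_len - needle_len;   /* last offset where a match can start */
    size_t pos = from;
    while (pos <= last) {
        /* Jump to the next byte equal to the needle's first byte. */
        const char *hit = memchr(hay + pos, needle[0], last - pos + 1);
        if (!hit) break;
        pos = (size_t)(hit - hay);
        if (memcmp(hit, needle, needle_len) == 0)
            return pos;                   /* full match found */
        pos++;                            /* false start, keep scanning */
    }
    return (size_t)-1;
}
```

Calling `buf_find` repeatedly with `from` set to one past the previous hit walks through all matches. A hit offset can then be mapped back to a file through whatever table of start offsets is kept for the block.
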