Tokenizer in QB64
#11
So my question is: why doesn't QB64 recognize a string variable declared As String,
when it does accept quoted "text"?
#12
(02-25-2023, 03:43 PM)aurel Wrote: but why i get syntax error with simple function call here:

'call function tokenizer()

tokenizer(test)

To call it as a function, you have to write something like this:

Code: (Select All)
dummy = tokenizer(test)

That is because QB64(PE) isn't as lenient as FreeBASIC and other languages, which allow a function call that "throws away" the result. The design of Microsoft's QuickBASIC resisted being like C/C++, where a function result can be ignored and the call made only for its side effect. In those languages everything is a function, including the one called "main()", which has to be in every single program.

If you have to give a function a data type at all, shouldn't that remind you that you need a variable of the same type on the left-hand side (LHS), and that there should be an LHS in the first place? Otherwise declare that "function" as a SUB and leave out the outer parentheses required for a function call. Like this:

Code: (Select All)
tokenizer test

This can look confusing to somebody who isn't used to programming this way, or who is used to other BASIC dialects.
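To put the two call styles side by side, here is a small sketch. The names countSpaces& and showText are made up for the illustration and are not part of the tokenizer from this thread; the only point is where the return value goes.

```basic
' Sketch only: countSpaces& and showText are made-up names for illustration.
DIM test AS STRING
DIM dummy AS LONG
test = "func tokenizer in QB64"

dummy = countSpaces&(test) ' FUNCTION: the result has to be assigned or used
PRINT dummy
showText test ' SUB: called as a statement, no outer parentheses

FUNCTION countSpaces& (src AS STRING)
    DIM i AS LONG, n AS LONG
    FOR i = 1 TO LEN(src)
        IF MID$(src, i, 1) = " " THEN n = n + 1
    NEXT
    countSpaces& = n ' the return value is set by assigning to the function name
END FUNCTION

SUB showText (src AS STRING)
    PRINT "text: "; src
END SUB
```

If the result really is of no use to the caller, the SUB form avoids the extra dummy variable.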

(02-25-2023, 04:47 PM)aurel Wrote: and i do as string variable which is of course declared with Dim

'call function tokenizer()

tokenizer&(test)

but he called function using quoted string
like :
accept&("text")

You need a long integer variable on the LHS, and you call it like this:

Code: (Select All)
DIM test AS STRING
dummy& = tokenizer(test)

or

Code: (Select All)
DIM dummy AS LONG, test AS STRING
dummy = tokenizer(test)

The "accept&()" case is the same thing. Just replace "tokenizer" with "accept" in the examples above. Functions require an LHS in QB64(PE). If this is not acceptable, and you don't want or need to declare another variable only to hold a "useless" function result, you should redefine that "function" as a SUB.
#13
mnv

I know everything you are saying; the only thing I didn't know is this:

Code: (Select All)
dummy = tokenizer(test)

So everything else is fine. Not every compiler "thinks" that everything is a function. I used to use o2, and there it is even stranger:
in o2 a SUB and a FUNCTION are the same thing, and both can return a value.
#14
Hello,
I finished translating my tokenizer from o2 to QB64.
The whole code compiles fine,
but when I run the program I get ILLEGAL FUNCTION CALL,
and the weird thing is that it is reported on LINE 112.

BUT there isn't any function call on line 112 ... anyone?
Code: (Select All)
'tokenizer in QB64pe (fb) by Aurel

declare function tokenizer&( src as string)
'declare function run_tokenizer(inputCode as string) as integer

const tkNULL=0, tkPLUS=1, tkMINUS=2, tkMULTI=3, tkDIVIDE=4
const tkCOLON=5, tkCOMMA=6, tkLPAREN=7, tkRPAREN=8, tkLBRACKET=9, tkRBRACKET=10
const tkIDENT = 11 , tkNUMBER = 12 , tkQSTRING = 13, tkCOMMAND =14 ,tkEOL = 15
const tkEQUAL = 16, tkMORE = 17, tkLESS = 18, tkAND = 19, tkOR = 20, tkNOT = 21
const tkHASH=22 , tkSSTR=23, tkMOD=24 , tkSEMI=25, tkDOT=26, tkLBRACE=27, tkRBRACE=28
const  tkQUEST=29, tkMONKEY=30 , tkBACKSLAH=31, tkPOWUP=32 ,tkAPOSTR=33 , tkTILDA=34

Dim shared tokList(1024)  As string                       'token array
Dim shared typList(1024)  As integer                      'token type array
Dim shared p              As Long : p=1
Dim shared start          as Long : start = 1
Dim shared tp             as long
Dim shared tn             as long
Dim shared n              as long
Dim shared ltp            as long  : ltp = 1
Dim shared nTokens        As long                            'nTokens -> number of tokens
Dim shared lineCount      As integer
Dim shared Lpar           as integer
Dim shared Rpar           as integer
Dim shared Lbrk           as integer
Dim shared Rbrk           as integer
Dim shared tokerr         as integer
Dim shared codeLen        as integer
Dim shared code           As String
Dim shared chs            As String
Dim shared tch            As String
Dim shared tk             As String
Dim shared crlf           As String
Dim shared bf             As String
Dim shared ntk            As String
Dim shared ch             As String
crlf = chr$(13) + chr$(10)
'test string .......................................................
Dim test as string  : test = "func tokenizer in QB64"
'...................................................................


tn = tokenizer&(test)
'print result on screen...
PRINT "Number of tokens: " + str$(tn)
PRINT "Number of lines: " + str$(lineCount)
nTokens = tn


' *** MAIN TOKENIZER FUNCTION ***
FUNCTION tokenizer& (src as string )
print "tokenizer run:" + src
lineCount=0:ltp=start : nTokens = 0

'Main Tokenizer Loop.....................................
WHILE p <= len(src)
'------------------------------------
ch = Mid$(src,p,1)   ' get char
'------------------------------------
If Asc(ch)=32 Then  p=p+1        ' skip blank space[ ]
If Asc(ch)=9  Then  p=p+1        ' skip TAB [    ]
If Asc(ch)=13 Then  p=p+1        ' skip CR -> goto Again -> NewLine

if asc(ch)=39  Then                ' skip comment line[ ' ]                                                       
    while asc(ch) <> 10
      p=p+1 : ch = mid$(src,p,1)
      if asc(ch)= 10 OR asc(ch) = 0  THEN Exit While
    wend
   lineCount=lineCount+1 : tp=tp+1 : tokList(tp)="EOL" : typList(tp)= tkEOL : tk="": ch=""  ' add EOL on comment end
   p=p+1 : goto endLoop                                                           ' jump to LABEL -> end of loop
end if

If asc(ch)=10  Then                                                    ' EOL
   if Lpar > Rpar  Then
      tokerr=3  : goto tokExit
   end if          ' if Rparen ((...)
   if Lpar < Rpar  Then
      tokerr=4  : goto tokExit
   end if             'if Lparen (...))
   if Lbrk > Rbrk  Then   
      tokerr=5  : goto tokExit
   end if              ' if Lbracket [..
   if Lbrk < Rbrk  Then   
      tokerr=6  : goto tokExit
   end if              ' if Rbracket ...]
lineCount=lineCount+1 : tp=tp+1 : tokList(tp)="EOL" :typList(tp)= tkEOL: tk="": ch="" : p=p+1
End if

'--------------------------------------------------------
If asc(ch)=34  Then                                                        ' if char is QUOTE "
  p=p+1 :  ch = Mid$(src,p,1) : tk=ch : p=p+1                                ' skip quote :add ch TO tk buffer: p+1
    while asc(ch) <> 34       
       ch = Mid$(src,p,1) : if asc(ch)= 34 then exit while
        tk=tk+ch : p=p+1
        IF ch = chr$(10) Then
           tokerr = 2: goto tokExit
        END IF
    wend
    tp=tp+1 : tokList(tp)= tk :typList(tp)= tkQSTRING: tk="":ch="": p=p+1    ' add quoted string to token list
End if

'-------------------------------------------------------           
If (asc(ch)>96 and asc(ch)<123) or (asc(ch)>64 and asc(ch)<91) or asc(ch)=95  Then                                      ' [a-z,A-Z_]
   while (asc(ch)>96 and asc(ch)<123) or  (asc(ch)>64 and asc(ch)<91) or (asc(ch)>47 and asc(ch)<58) or asc(ch)=95   ' [a-z,A-Z,0-9_]
         tk=tk+ch : p=p+1 : ch = mid$(src,p,1)
   wend
      ' ' add token ,add token type/IDENT:{VAR/COMMAND}
       tp=tp+1 : tokList(tp) = tk :typList(tp)= tkIDENT: tk="":ch=""       
End If

'--------------------------------------------------------------
If (asc(ch)>47 and asc(ch)<58) Then                                     ' [0-9.]
    while (asc(ch)>47 AND asc(ch)<58) OR asc(ch)=46                   ' [0-9[0.0]]*
        tk=tk+ch :p=p+1 : ch = mid$(src,p,1)
    wend
       ' add token ,add token type/NUMBER
       tp=tp+1 : tokList(tp) = tk : typList(tp)= tkNUMBER: tk="":ch=""
End if

'-------------------------------------------------------------------------
If asc(ch)=43 Then
    tp=tp+1 : tokList(tp) = ch :typList(tp)= tkPLUS:  ch="" : p=p+1
End If

If asc(ch)=45 Then
    tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMINUS:  ch="" : p=p+1
End If

If asc(ch)=42 Then
    tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMULTI:  ch="" : p=p+1
End If

If asc(ch)=47 Then
    tp=tp+1 : tokList(tp) = ch :typList(tp)= tkDIVIDE:  ch="" : p=p+1
End If

If asc(ch)=40 then  tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLPAREN:  ch="" : p=p+1 : Lpar=Lpar+1

If Asc(ch)=41 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRPAREN:  ch="" : p=p+1 : Rpar=Rpar+1   ' ) Rparen
If Asc(ch)=44 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkCOMMA:   ch="" : p=p+1               ' , comma
If Asc(ch)=58 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkCOLON:   ch="" : p=p+1               ' : colon
If Asc(ch)=59 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkSEMI :   ch="" : p=p+1             ' ; semi_colon
If Asc(ch)=60 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLESS:    ch="" : p=p+1               ' < less
If Asc(ch)=61 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkEQUAL:   ch="" : p=p+1               ' = equal
If Asc(ch)=62 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMORE:    ch="" : p=p+1               ' > more(greater)
If Asc(ch)=63 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkQUEST:    ch="" : p=p+1                 ' > questMark ?
If Asc(ch)=64 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMONKEY:    ch="" : p=p+1            ' > at(monkey) @

If Asc(ch)=91 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLBRACKET:ch="" : p=p+1 : Lbrk=Lbrk+1   ' ( Lbracket
If Asc(ch)=92 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkBACKSLAH:ch="" : p=p+1 : :             ' \ backSlash
If Asc(ch)=93 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRBRACKET:ch="" : p=p+1 : Rbrk=Rbrk+1   ' ) Rbracket

If Asc(ch)=94 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkPOWUP:    ch="" : p=p+1           ' ^ power up
If Asc(ch)=96 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkAPOSTR:   ch="" : p=p+1            ' ` apoStrophe
If Asc(ch)=38 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkAND:      ch="" : p=p+1            ' & AND
If Asc(ch)=124 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkOR:       ch="" : p=p+1             ' | OR
If Asc(ch)=33 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkNOT:      ch="" : p=p+1            ' ! NOT
If Asc(ch)=35 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkHASH:     ch="" : p=p+1            ' # hash
If Asc(ch)=36 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkSSTR:     ch="" : p=p+1            ' $ $TRING
If Asc(ch)=37 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMOD :     ch="" : p=p+1            ' % percent/MOD
If Asc(ch)=46 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkDOT :     ch="" : p=p+1            ' . dot/point
If Asc(ch)=123 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLBRACE  :ch="" : p=p+1           ' { LBrace
If Asc(ch)=125 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRBRACE  :ch="" : p=p+1           ' } RBrace
If Asc(ch)=126 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkTILDA   :ch="" : p=p+1           ' ~ tilda

IF ASC(ch)>126  then tokerr = 1 : goto tokExit
IF ASC(ch)<28   then tokerr = 1
IF ASC(ch)=0    then tokerr = 0
IF ASC(ch)=9    then tokerr = 0
IF ASC(ch)=10   then tokerr = 0
IF ASC(ch)=13   then tokerr = 0
IF tokerr = 1   then goto tokExit


endLoop:
WEND
tokenizer = tp
If tp <> 0 then EXIT FUNCTION

tokExit:
  IF tokerr > 0 THEN
    if tokerr = 1 then PRINT "Unknown token!-[ " + ch +" ] at LINE: " + str$(lineCount)
    if tokerr = 2 then PRINT "Unclosed Quote!- at LINE: "             + str$(lineCount)
    if tokerr = 3 then PRINT "Missing right paren! ((...)- at LINE: " + str$(lineCount)
    if tokerr = 4 then PRINT "Missing left paren! (...))- at LINE: "  + str$(lineCount)
    if tokerr = 5 then PRINT "Missing right bracket!- at LINE: "      + str$(lineCount)
    if tokerr = 6 then PRINT "Missing left bracket!- at LINE: "       + str$(lineCount)
    tokenizer& = 0 : EXIT FUNCTION
  END IF

tokenizer  = 0
End Function

'tokout& = tokenizer&(test)  ' if is called after function then cause error
#15
(11-21-2023, 04:13 PM)aurel Wrote: hello
I finished translation of my tokenizer from o2 to QB64
whole code compile fine
but when i run program i get ILLEGAL FUNCTION CALL
and the weird thing it is on LINE 112

BUT there is no any function call on line 112 ...anyone ?

Line 112:
IF (ASC(ch) > 47 AND ASC(ch) < 58) THEN ' [0-9.]

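ASC is a function call, and it throws the error when it gets an empty string as input. In your code the identifier branch just before this line ends with ch="", so ch is empty by the time line 112 tests it. You should check that "ch" is non-empty before passing it to ASC.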
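A minimal sketch of that check, assuming nothing beyond plain MID$, LEN and ASC (the sample string and variable names are only for illustration). It also shows a second way ch can end up empty: MID$ returns "" when the position is past the end of the string.

```basic
' Sketch only: variable names and the sample string are assumptions.
DIM src AS STRING, ch AS STRING
src = "func"
ch = MID$(src, LEN(src) + 1, 1) ' one position past the end: MID$ returns "" here
IF LEN(ch) > 0 THEN
    PRINT ASC(ch)
ELSE
    PRINT "ch is empty, so ASC(ch) is skipped"
END IF
```

Without the LEN test, the ASC(ch) call on the empty string is what stops the program with "Illegal function call".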
#16
Thanks to Mark (aka B+):
with his asc2() wrapper function
the tokenizer works well ;)

Code: (Select All)
'tokenizer in QB64pe (fb) by Aurel
'fix by B+ asc2()
declare function tokenizer&( src as string)
'declare function run_tokenizer(inputCode as string) as integer

const tkNULL=0, tkPLUS=1, tkMINUS=2, tkMULTI=3, tkDIVIDE=4
const tkCOLON=5, tkCOMMA=6, tkLPAREN=7, tkRPAREN=8, tkLBRACKET=9, tkRBRACKET=10
const tkIDENT = 11 , tkNUMBER = 12 , tkQSTRING = 13, tkCOMMAND =14 ,tkEOL = 15
const tkEQUAL = 16, tkMORE = 17, tkLESS = 18, tkAND = 19, tkOR = 20, tkNOT = 21
const tkHASH=22 , tkSSTR=23, tkMOD=24 , tkSEMI=25, tkDOT=26, tkLBRACE=27, tkRBRACE=28
const  tkQUEST=29, tkMONKEY=30 , tkBACKSLAH=31, tkPOWUP=32 ,tkAPOSTR=33 , tkTILDA=34

Dim shared tokList(1024)  As string                       'token array
Dim shared typList(1024)  As integer                      'token type array
Dim shared p              As Long : p=1
Dim shared start          as Long : start = 1
Dim shared tp             as long
Dim shared tn             as long
Dim shared n              as long
Dim shared ltp            as long  : ltp = 1
Dim shared nTokens        As long                            'nTokens -> number of tokens
Dim shared lineCount      As integer
Dim shared Lpar           as integer
Dim shared Rpar           as integer
Dim shared Lbrk           as integer
Dim shared Rbrk           as integer
Dim shared tokerr         as integer
Dim shared codeLen        as integer
Dim shared code           As String
Dim shared chs            As String
Dim shared tch            As String
Dim shared tk             As String
Dim shared crlf           As String
Dim shared bf             As String
Dim shared ntk            As String
Dim shared ch             As String
crlf = chr$(13) + chr$(10)
'test string .......................................................
Dim test as string  : test = "PRINT (a+b" + crlf
'...................................................................


tn = tokenizer&(test)
'print result on screen...
PRINT "Number of tokens: " + str$(tn)
PRINT "Number of lines: " + str$(lineCount)
nTokens = tn


' *** MAIN TOKENIZER FUNCTION ***
FUNCTION tokenizer& (src as string )
print "tokenizer run:" + src
lineCount=0:ltp=start : nTokens = 0

'Main Tokenizer Loop.....................................
WHILE p <= len(src)
'------------------------------------
ch = Mid$(src,p,1)   ' get char
'------------------------------------
If asc2(ch)=32 Then  p=p+1        ' skip blank space[ ]
If asc2(ch)=9  Then  p=p+1        ' skip TAB [    ]
If asc2(ch)=13 Then  p=p+1        ' skip CR -> goto Again -> NewLine

if asc2(ch)=39  Then                ' skip comment line[ ' ]                                                       
    while asc2(ch) <> 10
      p=p+1 : ch = mid$(src,p,1)
      if asc2(ch)= 10 OR asc2(ch) = 0  THEN Exit While
    wend
   lineCount=lineCount+1 : tp=tp+1 : tokList(tp)="EOL" : typList(tp)= tkEOL : tk="" : ch=""  ' add EOL on comment end
   p=p+1 : goto endLoop                                                           ' jump to LABEL -> end of loop
end if

If asc2(ch)=10  Then                                                    ' EOL
   if Lpar > Rpar  Then
      tokerr=3  : goto tokExit
   end if          ' if Rparen ((...)
   if Lpar < Rpar  Then
      tokerr=4  : goto tokExit
   end if             'if Lparen (...))
   if Lbrk > Rbrk  Then   
      tokerr=5  : goto tokExit
   end if              ' if Lbracket [..
   if Lbrk < Rbrk  Then   
      tokerr=6  : goto tokExit
   end if              ' if Rbracket ...]
lineCount=lineCount+1 : tp=tp+1 : tokList(tp)="EOL" :typList(tp)= tkEOL: tk="": p=p+1 : ch=""
End if

'--------------------------------------------------------
If asc2(ch)=34  Then                                                        ' if char is QUOTE "
  p=p+1 :  ch = Mid$(src,p,1) : tk=ch : p=p+1                                ' skip quote :add ch TO tk buffer: p+1
    while asc2(ch) <> 34       
       ch = Mid$(src,p,1) : if asc2(ch)= 34 then exit while
        tk=tk+ch : p=p+1
        IF ch = chr$(10) Then
           tokerr = 2: goto tokExit
        END IF
    wend
    tp=tp+1 : tokList(tp)= tk :typList(tp)= tkQSTRING: tk="": p=p+1    : ch="" 'add quoted string to token list
End if

'-------------------------------------------------------           
If (asc2(ch)>96 and asc2(ch)<123) or (asc2(ch)>64 and asc2(ch)<91) or asc2(ch)=95  Then                             ' [a-z,A-Z_]
   while asc2(ch)>96 and asc2(ch)<123  or  asc2(ch)>64 and asc2(ch)<91 or asc2(ch)>47 and asc2(ch)<58 or asc2(ch)=95   ' [a-z,A-Z,0-9_]
         tk=tk+ch : p=p+1 : ch = mid$(src,p,1)
   wend
      ' ' add token ,add token type/IDENT:{VAR/COMMAND}
       tp=tp+1 : tokList(tp) = tk :typList(tp)= tkIDENT: tk=""  :ch =""     
End If

'--------------------------------------------------------------
If asc2(ch) > 47 and asc2(ch) < 58 Then                                     ' [0-9.]
    while (asc2(ch)>47 AND asc2(ch)<58) OR asc2(ch)=46                   ' [0-9[0.0]]*
        tk=tk+ch :p=p+1 : ch = mid$(src,p,1)
    wend
       ' add token ,add token type/NUMBER
       tp=tp+1 : tokList(tp) = tk : typList(tp)= tkNUMBER: tk="" : ch=""
End if

'-------------------------------------------------------------------------
If asc2(ch)=43 Then tp=tp+1 : tokList(tp) = ch : typList(tp)= tkPLUS : ch="" : p=p+1
If asc2(ch)=45 Then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMINUS:  p=p+1  : ch=""
If asc2(ch)=42 Then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMULTI: p=p+1   : ch=""
If asc2(ch)=47 Then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkDIVIDE : p=p+1  : ch=""
If asc2(ch)=40 then  tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLPAREN:  ch="" : p=p+1 : Lpar=Lpar+1

If Asc2(ch)=41 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRPAREN:  ch="" : p=p+1 : Rpar=Rpar+1   ' ) Rparen
If Asc2(ch)=44 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkCOMMA:   ch="" : p=p+1               ' , comma
If Asc2(ch)=58 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkCOLON:   ch="" : p=p+1               ' : colon
If Asc2(ch)=59 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkSEMI :   ch="" : p=p+1             ' ; semi_colon
If Asc2(ch)=60 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLESS:    ch="" : p=p+1               ' < less
If Asc2(ch)=61 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkEQUAL:   ch="" : p=p+1               ' = equal
If Asc2(ch)=62 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMORE:    ch="" : p=p+1               ' > more(greater)
If Asc2(ch)=63 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkQUEST:    ch="" : p=p+1                 ' > questMark ?
If Asc2(ch)=64 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMONKEY:    ch="" : p=p+1            ' > at(monkey) @

If Asc2(ch)=91 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLBRACKET:ch="" : p=p+1 : Lbrk=Lbrk+1   ' ( Lbracket
If Asc2(ch)=92 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkBACKSLAH:ch="" : p=p+1 : :             ' \ backSlash
If Asc2(ch)=93 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRBRACKET:ch="" : p=p+1 : Rbrk=Rbrk+1   ' ) Rbracket

If Asc2(ch)=94 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkPOWUP:    ch="" : p=p+1           ' ^ power up
If Asc2(ch)=96 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkAPOSTR:   ch="" : p=p+1            ' ` apoStrophe
If Asc2(ch)=38 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkAND:      ch="" : p=p+1            ' & AND
If Asc2(ch)=124 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkOR:       ch="" : p=p+1             ' | OR
If Asc2(ch)=33 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkNOT:      ch="" : p=p+1            ' ! NOT
If Asc2(ch)=35 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkHASH:     ch="" : p=p+1            ' # hash
If Asc2(ch)=36 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkSSTR:     ch="" : p=p+1            ' $ $TRING
If Asc2(ch)=37 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkMOD :     ch="" : p=p+1            ' % percent/MOD
If Asc2(ch)=46 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkDOT :     ch="" : p=p+1            ' . dot/point
If Asc2(ch)=123 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkLBRACE  :ch="" : p=p+1           ' { LBrace
If Asc2(ch)=125 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkRBRACE  :ch="" : p=p+1           ' } RBrace
If Asc2(ch)=126 then tp=tp+1 : tokList(tp) = ch :typList(tp)= tkTILDA   :ch="" : p=p+1           ' ~ tilda

IF asc2(ch)>126  then tokerr = 1 : goto tokExit
IF asc2(ch)<28   then tokerr = 1
IF asc2(ch)=0    then tokerr = 0
IF asc2(ch)=9    then tokerr = 0
IF asc2(ch)=10   then tokerr = 0
IF asc2(ch)=13   then tokerr = 0
IF tokerr = 1   then goto tokExit


endLoop:
WEND
tokenizer = tp
If tp <> 0 then EXIT FUNCTION

tokExit:
  IF tokerr > 0 THEN
    if tokerr = 1 then PRINT "Unknown token!-[ " + ch +" ] at LINE: " + str$(lineCount)
    if tokerr = 2 then PRINT "Unclosed Quote!- at LINE: "             + str$(lineCount)
    if tokerr = 3 then PRINT "Missing right paren! ((...)- at LINE: " + str$(lineCount)
    if tokerr = 4 then PRINT "Missing left paren! (...))- at LINE: "  + str$(lineCount)
    if tokerr = 5 then PRINT "Missing right bracket!- at LINE: "      + str$(lineCount)
    if tokerr = 6 then PRINT "Missing left bracket!- at LINE: "       + str$(lineCount)
    tokenizer& = 0 : EXIT FUNCTION
  END IF

tokenizer  = 0
End Function

'tokout& = tokenizer&(test)  ' if is called after function then cause error

Function asc2& (ch$)
    If Mid$(ch$, 1, 1) <> "" Then asc2& = Asc(Mid$(ch$, 1, 1))
End Function


#17
Well, ZXDunny found the problem, and Paul Doe had a smarter check on ch$,
so I would rewrite the wrapper like this:
Code: (Select All)
Function asc2& (ch$) ' if you just don't know what ch$ might be
    If Len(ch$) Then asc2& = Asc(ch$)
End Function

When ch$ is empty, asc2&() returns 0 instead of shutting down the run with an error.
b = b + ...
#18
Well... and well.
Mark, as I already posted on the BASIC4US forum:
for me the main problem is the null-terminated string, i.e. chr$(0) or this "".
Such a thing is solved, or simply works, in other BASIC dialects,
so it should be fixed in QB64 too,
because it produces an ILLEGAL FUNCTION CALL runtime error.

so
ch$ = "" .....or ch$ = chr$(0)
in ASC(ch) must work !

From all that I read on Discord, ZX thinks he knows what the problem is.
PaulDoe gave a trick with Len(), but that does not work,
so your solution with asc2() works properly.
#19
I do think the code is very un-QB64. It could be structured much better, which would prevent issues and make it much more readable.

- Get rid of old BASIC DECLAREs, WHILE/WENDs, etc.
- Don't make all variables shared unless you have a good reason; use local variables in functions/subs as much as possible
- Use sensible variable names
- Separate small blocks of code into their own functions
- Don't use goto tokExit; it's a sign of bad structure in your function
- Structure with FOR, SELECT CASE, etc.

A lot of things can go wrong here. For example, inside your WHILE p <= Len(src) loop it is very risky to do p=p+1 : ch=mid$(src,p,1), since you don't check whether p now points outside src. And there are more issues like this hidden in the structure of your function.
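To illustrate the bounds check for one inner loop, here is a rough sketch of an identifier reader. readIdent$ is a hypothetical helper, not something from the posted tokenizer, and it assumes the caller has already verified that the first character is a letter or an underscore.

```basic
' Sketch only: readIdent$ is a hypothetical helper, not part of the posted code.
DIM src AS STRING, p AS LONG
src = "func tokenizer"
p = 1
PRINT readIdent$(src, p) ' p is passed by reference and is advanced by the helper
PRINT p ' now points at the first character after the identifier

FUNCTION readIdent$ (src AS STRING, p AS LONG)
    DIM c AS LONG, tk AS STRING
    DO WHILE p <= LEN(src) ' test the position before touching the string
        c = ASC(src, p) ' safe here: p is known to be inside src
        SELECT CASE c
            CASE 48 TO 57, 65 TO 90, 95, 97 TO 122 ' 0-9, A-Z, _, a-z
                tk = tk + CHR$(c)
                p = p + 1
            CASE ELSE
                EXIT DO
        END SELECT
    LOOP
    readIdent$ = tk
END FUNCTION
```

Because the position is tested at the top of every iteration, the loop can never hand an empty string or an out-of-range position to ASC.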

Something like this (I didn't look at all the details):
Code: (Select All)
Const FALSE = 0, TRUE = Not FALSE
Function tokenizer& (src As String)
  srclen& = Len(src)
  srcpos& = 0
  Do While srcpos& < srclen&
    srcpos& = srcpos& + 1
    char% = Asc(src, srcpos&) ' get char
    Select Case char%
      Case 32 'skip space
      Case 9 'skip tab
      Case 13 'skip cr
        skipline% = FALSE: lineCount = lineCount + 1
      Case 39 'skip commentline
        skipline% = TRUE
      Case 10 ' lf
        skipline% = FALSE: lineCount = lineCount + 1
        tp& = addToken("eol")
      Case 34 'double quote
        If Not skipline% Then
          If dquote& = 0 Then
            dquote& = srcpos&
          Else
            tp& = addToken("from dquote& to srcpos&")
            dquote& = 0
          End If
        End If
      Case 48 To 57 '0 to 9
        If Not skipline% Then tp& = addToken("digit")
      Case 43 '+ -> tkPLUS (char% holds an ASCII code, not a tk... constant)
        If Not skipline% Then tp& = addToken("tkPlus")
      Case 45 '- -> tkMINUS
        If Not skipline% Then tp& = addToken("tkMINUS")
      Case 42, 47 '* and /; handle the other tk... characters the same way
        If Not skipline% Then tp& = addToken("something")
      Case 92 To 125 'all other valid chars?
        If Not skipline% Then tp& = addToken("something")
      Case Else
    End Select
  Loop
  tokenizer& = tp&
End Function

Function addToken& (something$)
  If something$ = "allright" then
    tokens = tokens + 1
    ' add token to array
    addToken& = tokens
  Else
    addToken& = errorcode ' Negative
  End If
End Function

Function tokError (errcode%)
  ' show and/or handle error
End Function
45y and 2M lines of MBASIC>BASICA>QBASIC>QBX>QB64 experience
#20
(11-22-2023, 10:11 AM)aurel Wrote: so
ch$ = "" .....or ch$ = chr$(0)
in ASC(ch) must work !

Actually ASC(CHR$(0)) works; it returns 0. But that is also the reason ASC("") can't work the same way. What should it return? Zero, maybe? That would be ambiguous with regard to ASC(CHR$(0)).

Also, ASC returns the ASCII value of a character. If there is NO character, as in ASC(""), then what is the ASCII value of this non-existent character?
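If a wrapper is wanted anyway, one way to sidestep the ambiguity is to return a value that can never be a character code. The sketch below is only an illustration: ascOrNone& is a made-up name, and -1 is an arbitrary choice for "no character".

```basic
' Sketch only: ascOrNone& is a hypothetical name; -1 is an arbitrary "no character" marker.
PRINT ascOrNone&("A")
PRINT ascOrNone&(CHR$(0)) ' a real character whose code is 0
PRINT ascOrNone&("") ' no character at all

FUNCTION ascOrNone& (s AS STRING)
    IF LEN(s) = 0 THEN
        ascOrNone& = -1 ' cannot collide with a real code (0 to 255)
    ELSE
        ascOrNone& = ASC(s)
    END IF
END FUNCTION
```

With this variant the caller can still tell CHR$(0) apart from an empty string, which a wrapper returning 0 for both cannot do.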

Yes, sometimes you need to use your own brain, QB64 has no AI implemented in it :P ;)