Audio storage, stereo switching


RE: Audio storage, stereo switching - Petr - 02-22-2025

If I understand correctly, you want to encode data into sound. Well, it could probably work. What do you need for that? You need 256 different frequencies, one for each possible byte value. The rest depends on the byte rate you want to use, and on whether the FFT can read the tones back. The FFT is very accurate, so it should. Let's say you start at 150 Hz and go up in steps of 50 Hz, so byte 255 will be at 150 + 255 * 50 = 12900 Hz. For that, you need a sampling frequency of at least 25800 Hz, twice the highest tone (that's fine). To fully modulate a 50 Hz oscillation you need one full period of it: sample time t = 1 / f, so 1 / 50 = 0.02 seconds, and number of samples = t * _SndRate = 0.02 * 44100 = 882 samples.
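As a rough sketch of the mapping described above, the following plays a single byte as one tone. The names BaseHz and StepHz are made up for this illustration, and the 0.5 volume is an arbitrary choice; only the 150 Hz start, the 50 Hz step and the one-period length come from the text.

```qb64
' Sketch only: BaseHz and StepHz are illustrative names, not part of any library
Const BaseHz = 150 ' frequency used for byte 0
Const StepHz = 50 ' spacing between neighbouring byte values
Dim As Long b, n, i
Dim As Single f

b = Asc("A") ' byte value 65
f = BaseHz + b * StepHz ' 150 + 65 * 50 = 3400 Hz
n = _SndRate \ StepHz ' one 50 Hz period: 882 samples at 44100 Hz

For i = 0 To n - 1
    _SndRaw 0.5 * Sin(2 * _Pi * f * i / _SndRate)
Next i
Print "Byte"; b; "sent as"; f; "Hz using"; n; "samples"
```

Sending a whole string would just repeat the loop for every character, so at 0.02 seconds per byte the theoretical rate is about 50 bytes per second.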

So the higher the sound frequencies you choose, the faster the data can be read, but only in theory; it also depends a lot on how precisely the FFT can resolve them. Instead of the checksum used in files, in this case I would rather take inspiration from network operation. You first send a signal to the other party that starts a data block. The block has a fixed size of 256 bytes, so the receiver waits for 256 frequencies. After receiving them, the other party confirms that the data has arrived, the sum of the bytes is sent for checking, and the process is repeated. It follows that the program reading data from the sound must decode the first block as something like a file header, which describes how many blocks are being sent. The blocks are then read, and finally a check is made that the length of the received data agrees with what was declared. The rest depends on how you operate it.

In general, though, sending data this way is very slow, even over a walkie-talkie. If the sound is distorted, the transmission must be repeated, and it is not suitable for long distances. That depends on the power used and the radio band.

Interesting idea, this. I would simply do it like when you write a program to store images. One program stores the data, another displays it. This will be the same. One program takes a set of bytes, assigns a frequency to each byte, and modulates the signal accordingly. For functionality tests, skip the data transfer and write a second program that takes the stored data and analyzes it back and decomposes it into bytes.

You modulate the signal as a sine wave with the given frequency and save the sound data in a binary file; you don't even have to use WAV for this. The second program opens the data file and converts it back to data. If this works, the only remaining question is how much distortion the transmission adds in practice.

To transfer data between two computers in the form of sound, you will need a good quality microphone, or connect the SPK output of one computer to the LineIn of the other computer with a cable. But here we run into a problem, because so far I don't know of anything in QB64PE that can record sound from LineIn. It is probably necessary to use winmm.dll, but I currently don't have a program that can communicate with LineIn.

RE: Audio storage, stereo switching - Petr - 02-22-2025

Easy example. Here the volume level is used as the byte indicator.

Code: (Select All)



ReDim Shared CodedSound(0) As Single

CodeSnd "Hi this is text example, this time volume level is used as byte - modulation. This maybe can be good between two computers, but not good for radio signal."

' Play the encoded samples (the array is zero-based)
For i = 0 To UBound(CodedSound)
    _SndRaw CodedSound(i)
Next i

' Decode the array back to text as a check
Print EncodeSnd

Sub CodeSnd (Text As String)
    ' Each character becomes 100 samples whose level is Asc / 256
    ReDim CodedSound(Len(Text) * 100 - 1) As Single
    Dim As Long I, L, G
    Dim As Single stp
    stp = 1 / 256
    For L = 1 To Len(Text)
        For G = 1 To 100 ' one signal amplitude length is 100 samples
            CodedSound(I) = Asc(Text, L) * stp
            I = I + 1
        Next G
    Next L
End Sub

' Despite its name, this function decodes CodedSound back into text
Function EncodeSnd$
    Dim As Long i, g, j
    Dim As Single s
    For i = 1 To (UBound(CodedSound) + 1) \ 100
        s = 0
        For g = 1 To 100
            s = s + CodedSound(j)
            j = j + 1
        Next g
        s = (s / 100) * 256 ' average level back to a byte value
        e$ = e$ + Chr$(s)
    Next i
    EncodeSnd$ = e$
End Function

RE: Audio storage, stereo switching - madscijr - 02-24-2025

(02-22-2025, 08:56 PM)Petr Wrote: If I understand correctly, you want to encode data into sound. Well, it could probably work. What do you need for that? You need to encode 256 different frequencies, each representing one byte.

That makes sense. Back in the cassette days, if one tape player's motor was slower than another's, wouldn't the pitch / frequency of the sounds be slightly different depending on the tape machine used? How did they get around this?

(02-22-2025, 08:56 PM)Petr Wrote: The second program opens the data file and converts it back to data. If this works, the only question is what the distortion will be during transmission in practice.

Yes, that's why I was wondering what kind of checksum or redundancy might be used.

(02-22-2025, 08:56 PM)Petr Wrote: To transfer data between two computers in the form of sound, you will need a good quality microphone, or connect the SPK output of one computer to the LineIn of the other computer with a cable. But here we run into a problem, because so far I don't know of anything in QB64PE that can record sound from LineIn. It is probably necessary to use winmm.dll, but I currently don't have a program that can communicate with LineIn.

This might be something a @SpriggsySpriggs API solution could help with?

I guess for now we could record the audio with some other program and save it to a WAV file, and the decoding program could extract the data from that?

Thanks for your answer, I will try running your code later!

RE: Audio storage, stereo switching - Petr - 02-24-2025

So, how was it solved on the cassette? First there was the synchronization tone. Based on that, the computer recognized the speed at which the cassette was running and then read the data accordingly. Basically, you can say that if the computer expected a frequency of 1000 Hz but got 1200 Hz, it slowed down the next incoming data (in memory), just like sound can be slowed down (that's how I understood it).

Then came the actual transmission. It wasn't each BYTE that had its own frequency, but each BIT: zero had one frequency and one had another. Before each byte was sent, start bits were sent first. Then came the byte (a series of 8 bits) and then the stop bits. Based on that, the computer could tell the individual bytes apart. This all took place in blocks of a predetermined size, which was usually stored at the beginning of the recording on the cassette.

When the entire block had been sent, an XOR check calculation was performed, and if the received data agreed with the expected state, reading continued; otherwise an error was declared and the load was interrupted. There were other control mechanisms too, for example the CRC check calculation, where a value was recorded at the end of the block on the tape and the computer compared it with what it had read.

Example of an XOR calculation: as the individual bytes arrive one after the other, you XOR each one with the running result. The value you are supposed to end up with is stored on the tape, and it must match. Example for 10 bytes (assume the bytes come from the cassette in the order 5A, B3, which gives E9; then comes 7E, which gives 97; then 44, which gives D3; and so on):

1. 5A XOR B3 = E9
2. E9 XOR 7E = 97
3. 97 XOR 44 = D3
4. D3 XOR 21 = F2
5. F2 XOR 9F = 6D
6. 6D XOR C2 = AF
7. AF XOR 3D = 92
8. 92 XOR 88 = 1A
9. 1A XOR FE = E4

So at this point on the cassette there will be the byte E4. If it matches, reading continues.
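A minimal sketch of the running XOR above, using the same ten bytes. The function name XorChecksum is made up for this illustration.

```qb64
' Sketch only: XorChecksum is an illustrative helper, not a built-in
Dim block As String
block = Chr$(&H5A) + Chr$(&HB3) + Chr$(&H7E) + Chr$(&H44) + Chr$(&H21)
block = block + Chr$(&H9F) + Chr$(&HC2) + Chr$(&H3D) + Chr$(&H88) + Chr$(&HFE)
Print Hex$(XorChecksum(block)) ' prints E4

Function XorChecksum& (block As String)
    Dim As Long i, x
    For i = 1 To Len(block)
        x = x Xor Asc(block, i) ' fold each byte into the running value
    Next i
    XorChecksum& = x
End Function
```

The sender would append the result as one extra byte, and the receiver would run the same function over the data bytes and compare.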

CRC is more complicated, and there are better experts on it than me. It's basically binary division, where you append the binary remainder of the division to the end of the block, and it has to match. The data is divided by a so-called polynomial, and you can do 8-bit, 16-bit or 32-bit division. I don't know the details, so you'd have to study this somewhere, or maybe someone will want to explain it to you.
This method is more reliable than XOR but also more demanding to program. BUT heads up: in QB64PE there's a _CRC32 function for that, though you will have to look up the details.
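A small sketch of how the _CRC32 function mentioned above might be used to check a block. It assumes a QB64PE version that provides _CRC32 and that the function takes a string and returns an unsigned 32-bit value; the block contents and variable names are made up.

```qb64
' Sketch only: assumes a QB64PE version that provides _CRC32
Dim block As String
Dim As _Unsigned Long sentCrc, receivedCrc

block = "Hello over audio"
sentCrc = _CRC32(block) ' value the sender would append to the block

Mid$(block, 3, 1) = "X" ' simulate one damaged byte in transit
receivedCrc = _CRC32(block)

If receivedCrc = sentCrc Then
    Print "Block OK"
Else
    Print "Block damaged, ask for it again"
End If
```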

Now for data transfer using sound. First, you open the file to be transmitted for binary reading. Then you send a synchronization sound signal, for example 3 seconds of the tone for 1 followed by 3 seconds of the tone for 0. Based on this, the other side learns what the signals for 1 and 0 look like.
Maybe send a short sound pulse beforehand so that the other side can prepare for synchronization. Then you create a one-second buffer using DIM and start reading data from the file. You split each byte you read into bits. If a bit is 1, you modulate one frequency in the positive half-wave, and if it is 0, you modulate the other frequency in the negative half-wave. This should limit the influence of noise. Then you play these modulated frequencies via _SNDRAW.

On the receiving side there will be a computer with a sensitive microphone. The program will contain an FFT and will analyze the received signal by frequency. You can find the FFT in my equalizer program. Spriggsy recently added a microphone program here.
Step 1: You wait for the start signal. Just any sound. The moment it stops, you start listening to what a logical one sounds like (for three seconds, for example, as I wrote at the beginning) and save the sample in memory. Then you listen to what a logical zero sounds like and save that sample in memory too. After that you receive and record the incoming signal from the microphone. You have to take into account that the FFT processing will take a while, so feel free to let the input buffer fill up and empty it slowly (probably more slowly than it fills). According to which of the two frequencies is detected, you write individual bits into a BYTE, and when the BYTE is full, you save it in an array and start filling the next one.
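The last step, collecting bits into bytes, could look roughly like this. The names AddBit, CurrentByte, BitCount and Received are made up for this sketch, and it assumes the bits arrive most significant bit first.

```qb64
' Sketch only: all names here are illustrative; bits are assumed MSB first
Dim Shared As Long CurrentByte, BitCount
Dim Shared Received As String
Dim bits As String
Dim As Long i

bits = "0100100001101001" ' 16 decoded bits
For i = 1 To Len(bits)
    AddBit Val(Mid$(bits, i, 1))
Next i
Print Received ' prints Hi

Sub AddBit (bit As Long)
    CurrentByte = CurrentByte * 2 + bit ' shift left and append the new bit
    BitCount = BitCount + 1
    If BitCount = 8 Then ' byte is full: store it and start the next one
        Received = Received + Chr$(CurrentByte)
        CurrentByte = 0
        BitCount = 0
    End If
End Sub
```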

The way you determine the data blocks after which you will perform the check is up to your imagination. It's mostly like a file - after reading what a logical one and a logical zero look like, there should be a header that tells the program whether XOR or CRC is used and the length of the data block after which a check BYTE will be stored in the case of XOR. Theoretically, it should work, but it will take patience and perseverance.

Example of how to "send" 8 bits:

Code: (Select All)





CLS
datas$ = "10110101" ' binary example
SendData datas$
END

SUB SendData (datas AS STRING)
    DIM i AS LONG
    DIM bit AS INTEGER
    FOR i = 1 TO LEN(datas)
        bit = VAL(MID$(datas, i, 1))
        SendBit bit
    NEXT i
END SUB

SUB SendBit (bit AS INTEGER)
    DIM frequency AS SINGLE, duration AS SINGLE, volume AS SINGLE
    DIM i AS LONG, samples AS LONG
    IF bit = 0 THEN
        frequency = 1200 ' 0 = 1200 Hz
    ELSE
        frequency = 2400 ' 1 = 2400 Hz
    END IF
    duration = 0.1 ' 100 ms per bit
    volume = 0.5
    samples = duration * _SNDRATE ' 4410 samples if _SNDRATE returns 44100
    ' Generate a sine tone and queue it for playback
    FOR i = 0 TO samples - 1
        _SNDRAW volume * SIN(2 * _PI * frequency * i / _SNDRATE)
    NEXT i
END SUB
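SendData above takes a string of "0" and "1" characters, so real bytes have to be turned into such a string first. A sketch of that step follows, including the start and stop bits described for the cassette format. The function names are made up, and using exactly one start bit (0) and one stop bit (1) is an assumption for illustration.

```qb64
' Sketch only: one start bit (0) and one stop bit (1) are assumptions
Print ByteToBits$(&HB5) ' the 8 data bits, most significant first
Print FrameByte$(&HB5) ' same bits wrapped in start and stop bits

Function ByteToBits$ (value As Long)
    Dim As Long bit
    Dim s As String
    For bit = 7 To 0 Step -1
        If value And _SHL(1, bit) Then s = s + "1" Else s = s + "0"
    Next bit
    ByteToBits$ = s
End Function

Function FrameByte$ (value As Long)
    FrameByte$ = "0" + ByteToBits$(value) + "1"
End Function
```

For the byte B5 this prints 10110101 and then 0101101011; either string could be handed to SendData.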

On the receiving side, you would run something like this (I'm not sure whether the file can be loaded with _MemSound if it is captured as a stream by the Spriggsy program):

Code: (Select All)



' Input source: we assume the signal is coming through a microphone or an audio input
Dim Source As Long
Dim Snd As _MEM
Dim A As Long, i As Long
Dim N As Long
Dim Block(1023) As Single
Dim RealPart(1023) As Single, ImagPart(1023) As Single
Dim SamplingRate As Single

Source = _SndOpen("test.wav") ' Use an input WAV file for testing    - or microphone source?
N = 1024 ' FFT Block size
Snd = _MemSound(Source, 0)
SamplingRate = _SndRate

Print "Starting decoding..."
Do While A + 4 <= Snd.SIZE
    ' Load a block of samples; a short final block is padded with zeros
    For i = 0 To N - 1
        If A + 4 <= Snd.SIZE Then
            Block(i) = _MemGet(Snd, Snd.OFFSET + A, Single)
            A = A + 4 ' Mono Snd (assumes 4 bytes per sample, stored as SINGLE)
        Else
            Block(i) = 0
        End If
        RealPart(i) = Block(i)
        ImagPart(i) = 0
    Next i

    ' Apply FFT to the block
    Call FFT(RealPart(), ImagPart(), N)

    ' Decode the bit based on the dominant frequency
    Call DecodeBit(RealPart(), ImagPart(), N)

    ' Wait for the next block
    _Delay 0.1
Loop
_MemFree Snd
End

Sub DecodeBit (RealPart() As Single, ImagPart() As Single, N As Long)
    Dim Frequency As Single
    Frequency = GetDominantFrequency(RealPart(), ImagPart(), N)

    ' Detect bits according to the dominant frequency
    If Frequency > 1100 And Frequency < 1300 Then
        Print "Bit: 0"
    ElseIf Frequency > 2300 And Frequency < 2500 Then
        Print "Bit: 1"
    Else
        Print "Unknown frequency: "; Frequency
    End If
End Sub

Function GetDominantFrequency (RealPart() As Single, ImagPart() As Single, N As Long)
    Dim k As Long
    Dim MaxAmplitude As Single, Amplitude As Single
    Dim DominantIndex As Long
    Dim Frequency As Single

    MaxAmplitude = 0
    DominantIndex = 0
    For k = 1 To N \ 2 ' start at 1 so a DC offset in bin 0 cannot win
        ' Calculate amplitude using Pythagorean theorem
        Amplitude = Sqr(RealPart(k) ^ 2 + ImagPart(k) ^ 2)
        If Amplitude > MaxAmplitude Then
            MaxAmplitude = Amplitude
            DominantIndex = k
        End If
    Next k

    ' Convert index to frequency
    Frequency = DominantIndex * _SndRate / N
    GetDominantFrequency = Frequency
End Function

' Fast Fourier Transform (FFT), in place; N must be a power of two
Sub FFT (RealPart() As Single, ImagPart() As Single, N As Long)
    Dim i As Long, j As Long, k As Long, m As Long, stp As Long
    Dim angle As Double
    Dim tReal As Double, tImag As Double, uReal As Double, uImag As Double

    ' Bit-reverse permutation
    j = 0
    For i = 0 To N - 1
        If i < j Then
            Swap RealPart(i), RealPart(j)
            Swap ImagPart(i), ImagPart(j)
        End If
        k = N \ 2
        Do While (k >= 1 And j >= k)
            j = j - k
            k = k \ 2
        Loop
        j = j + k
    Next i

    ' FFT: Loop over the size of each level (N)
    m = 1
    Do While m < N
        stp = m * 2
        angle = -3.14159265359 / m
        For k = 0 To m - 1
            uReal = Cos(k * angle)
            uImag = Sin(k * angle)
            For i = k To N - 1 Step stp
                j = i + m
                tReal = uReal * RealPart(j) - uImag * ImagPart(j)
                tImag = uReal * ImagPart(j) + uImag * RealPart(j)
                RealPart(j) = RealPart(i) - tReal
                ImagPart(j) = ImagPart(i) - tImag
                RealPart(i) = RealPart(i) + tReal
                ImagPart(i) = ImagPart(i) + tImag
            Next i
        Next k
        m = stp
    Loop
End Sub
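A quick calculation shows what this block size means for the two tones. This is only a sketch of the arithmetic, assuming the 44100 Hz rate and N = 1024 used in the examples above.

```qb64
' Sketch only: 44100 Hz and N = 1024 are taken from the examples above
Dim As Long N
Dim As Single rate
N = 1024
rate = 44100
Print "Bin width:"; rate / N; "Hz" ' about 43 Hz
Print "1200 Hz lands in bin"; CLng(1200 * N / rate) ' bin 28
Print "2400 Hz lands in bin"; CLng(2400 * N / rate) ' bin 56
Print "FFT blocks per 100 ms bit:"; 0.1 * rate / N ' about 4.3
```

Both tones land well inside the 1100-1300 Hz and 2300-2500 Hz windows tested in DecodeBit. Because one bit spans about 4.3 blocks, the decoder as written will report each bit several times, and a block that straddles two bits contains a mix of both tones, so a real receiver would still need to line its blocks up with the bit boundaries.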

And for that you will need the microphone program from @SpriggsySpriggs:

Code: (Select All)



OPTION _EXPLICIT

_TITLE "mciSendString Record Test"

StartRecording
DIM x
DIM y

x = TIMER(0.01)
DO
    y = TIMER(0.01)
    CLS
    PRINT "Recording...Press any key to stop"
    PRINT y - x
    _DISPLAY
LOOP UNTIL INKEY$ <> ""

StopRecording
SaveRecording ("test.wav")
PlayRecording ("test.wav")

_TITLE "Waveform Display"
SCREEN GetWaveform("test.wav", "640x480")

DECLARE DYNAMIC LIBRARY "WINMM"
    FUNCTION mciSendStringA% (lpstrCommand AS STRING, lpstrReturnString AS STRING, BYVAL uReturnLength AS _UNSIGNED LONG, BYVAL hwndCallback AS LONG)
    FUNCTION mciGetErrorStringA% (BYVAL dwError AS LONG, lpstrBuffer AS STRING, BYVAL uLength AS _UNSIGNED LONG)
END DECLARE

SUB StartRecording
    DIM a AS LONG
    a = mciSendStringA("open new type waveaudio alias capture" + CHR$(0), "", 0, 0)
    IF a THEN
        DIM x AS INTEGER
        DIM MCIError AS STRING
        MCIError = SPACE$(255)
        x = mciGetErrorStringA(a, MCIError, LEN(MCIError))
        PRINT MCIError
        END
    ELSE
        a = mciSendStringA("set capture time format ms bitspersample 16 channels 2 samplespersec 48000 bytespersec 192000 alignment 4" + CHR$(0), "", 0, 0)
        a = mciSendStringA("record capture" + CHR$(0), "", 0, 0)
    END IF
END SUB

SUB StopRecording
    DIM a AS LONG
    a = mciSendStringA("stop capture" + CHR$(0), "", 0, 0)
END SUB

SUB SaveRecording (file AS STRING)
    DIM a AS LONG
    a = mciSendStringA("save capture " + CHR$(34) + file + CHR$(34) + CHR$(0), "", 0, 0)
    a = mciSendStringA("close capture" + CHR$(0), "", 0, 0)
END SUB

SUB PlayRecording (file AS STRING)
    DIM a AS LONG
    a = mciSendStringA("play " + CHR$(34) + file + CHR$(34) + CHR$(0), "", 0, 0)
END SUB

FUNCTION GetWaveform& (file AS STRING, size AS STRING)
    IF _FILEEXISTS("output.png") THEN
        KILL "output.png"
    END IF
    SHELL _HIDE "ffmpeg -i " + CHR$(34) + file + CHR$(34) + " -filter_complex " + CHR$(34) + "showwavespic=s=" + size + CHR$(34) + " -frames:v 1 output.png"
    GetWaveform = _LOADIMAGE("output.png", 32)
END FUNCTION

RE: Audio storage, stereo switching - madscijr - 02-25-2025

(02-24-2025, 09:29 PM)Petr Wrote: So how it was solved on the cassette.
...
Theoretically, it should work, but it will take patience and perseverance.
...
example how "send" 8 bites:
...

Petr, this is really fascinating, thank you so much for taking the time to explain and even share sample code. I think this could benefit a lot of the programmers down the road who just want to learn more about how to encode and decode information. Whether it's with sound or some other medium, many of the concepts will carry over.

It may take me some time to absorb all this, but I think you covered all the important steps.

Thanks again - I look forward to playing with all this code and seeing what it can do.