People have been walking around with microphones in their pockets for years but only recently has mobile browser support reached the level required to build a fully cross-platform, feature-rich, browser-based audio recorder. GMass’s sister product is Wordzen, also an extension for Gmail, but with a completely different purpose — to write all of your emails for you. In the coming days, Wordzen will begin working inside the Gmail app on your mobile phone, and in order to achieve that, we had to figure out how to record audio from within the web browser on both iOS (iPhone) and Android.
In this post, I’m going to detail the key components of our audio recorder and upload application. Keep reading, or download a simple demo now.
The application needs to record audio, show visual feedback, allow playback, and finally, the audio needs to be uploadable to my server for transcribing.
One of the newer APIs available is the MediaRecorder API. My first attempt at building this application started with this API. I implemented the entire application and it worked great on my desktop. It was easy to capture audio, and the data was already compressed into .ogg format and ready to ship to my server. Then I tried it on iOS. It turned out that the MediaRecorder API was not supported there, so it wouldn’t meet my needs. After I stopped cursing Apple, I began again from scratch.
After searching unsuccessfully for a demo that worked on all platforms, I was about to give up. I did, however, find multiple demos that worked on individual platforms, so I set about combining what I’d learned into a single class that would work everywhere. Even after finding examples that worked on my iPhone, my own demo still failed because of the security built into iOS, which requires that microphone actions be triggered by a click. That was the last piece of the puzzle that allowed me to construct a working demo, which revolves around three steps:
- Capture the microphone so we can begin recording
- Accumulate captured audio data into a series of byte array chunks
- Combine the chunks into one large array and massage the array into the format of a .wav file
Step 1: Capture the microphone
Capturing the microphone revolves around creating an audio context and then asking the user for permission to use the microphone. There are several shims involved to make this work on each platform, so we have to check what’s supported to get the right instances:
    audioCtx = new (window.AudioContext || window.webkitAudioContext)();
    if (audioCtx.createJavaScriptNode) {
        audioNode = audioCtx.createJavaScriptNode(bufferSize, 1, 1);
    } else if (audioCtx.createScriptProcessor) {
        audioNode = audioCtx.createScriptProcessor(bufferSize, 1, 1);
    } else {
        throw 'WebAudio not supported!';
    }
    audioNode.connect(audioCtx.destination);
Once we’ve done the setup, we can then ask the browser to prompt the user for access to the microphone. The result of this call is a promise that we can use to trigger a callback when it successfully completes:
    navigator.mediaDevices.getUserMedia({audio: true})
        .then(onMicrophoneCaptured)
        .catch(onMicrophoneError);
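Readers in the comments below report that `navigator.mediaDevices` can be undefined in some contexts, in which case the call above throws before any promise is created. A minimal sketch of a guard, assuming two hypothetical helpers (`hasGetUserMedia` and `captureMicrophone`, neither of which is part of the original recorder) that take the navigator object as a parameter:

```javascript
// Hypothetical helper: true when the promise-based getUserMedia call exists
// on the given navigator-like object.
function hasGetUserMedia(nav) {
    return !!(nav && nav.mediaDevices &&
        typeof nav.mediaDevices.getUserMedia === 'function');
}

// Hypothetical wrapper: reports a missing API through the error callback
// instead of throwing a TypeError. Returns true if a request was started.
function captureMicrophone(nav, onCaptured, onError) {
    if (!hasGetUserMedia(nav)) {
        onError(new Error('getUserMedia is not available in this context'));
        return false;
    }
    nav.mediaDevices.getUserMedia({ audio: true })
        .then(onCaptured)
        .catch(onError);
    return true;
}
```

In a page this would be called as `captureMicrophone(navigator, onMicrophoneCaptured, onMicrophoneError)`, so the same error handler covers both a denied permission and a missing API.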
Once the microphone has been captured, we can register to receive audio data as it comes in by handling the node’s audioprocess event through its onaudioprocess property:
    audioInput = audioCtx.createMediaStreamSource(microphone);
    audioInput.connect(audioNode);
    audioNode.onaudioprocess = onAudioProcess;
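Because iOS only allows microphone actions that are triggered by a click, the setup in this step has to be started from a click handler rather than on page load. A minimal sketch, assuming a hypothetical `startRecording` function that wraps the setup code above and a button element of your own (both names are illustrative, not taken from the original class):

```javascript
// Hypothetical wiring: run the microphone setup only in response to a click,
// so that iOS treats it as a user-initiated action.
function bindRecordButton(button, startRecording) {
    button.addEventListener('click', function () {
        // The audio context and the getUserMedia call are created here,
        // inside the click handler, not at page load.
        startRecording();
    });
}
```

In a page this might be called as `bindRecordButton(document.getElementById('record'), startRecording)`, where the `record` id is also an assumption.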
Step 2: Accumulate captured audio
Although the audio process event returns multi-channel data, for our purposes we only need one channel of audio data. This also creates a smaller file that will make uploads faster. When data arrives, we simply get one channel’s data and add it to our list of recorded data. This is also a good time to notify anybody listening that we’re still recording so we can show the current duration or other visualization:
    recordedData.push(new Float32Array(e.inputBuffer.getChannelData(0))); // copy channel 0
    self.duration = new Date().getTime() - self.startDate.getTime();
    config.onRecording && config.onRecording(self.duration);
Step 3: Massage the array into the format of a .wav file
Now that we have a large array of audio data “chunks”, we need to combine those chunks into a single array and generate a .wav file. Although we could do this synchronously, a long recording could hang the user’s browser. To avoid this, we’ll offload the heavy lifting to a Web Worker that will return a DataView that we can use to construct a Blob which can be played or uploaded.
The first step of the .wav generation process is combining the chunks of data into a single Float64Array, which simply involves creating a large enough buffer and then setting the chunks into it, advancing the offset by the size of each chunk:
    var result = new Float64Array(count);
    var offset = 0;
    var lng = channelBuffer.length;
    for (var i = 0; i < lng; i++) {
        var buffer = channelBuffer[i];
        result.set(buffer, offset);
        offset += buffer.length;
    }
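The snippet above uses a `count` value without showing where it comes from. A minimal sketch, assuming `count` is simply the total number of samples across all chunks; the `mergeChunks` name is hypothetical and not taken from the original class:

```javascript
// Hypothetical helper: sums the chunk lengths to size the buffer, then
// copies each chunk into place, as in the loop above.
function mergeChunks(chunks) {
    var count = 0;
    for (var i = 0; i < chunks.length; i++) {
        count += chunks[i].length;
    }
    var result = new Float64Array(count);
    var offset = 0;
    for (var j = 0; j < chunks.length; j++) {
        result.set(chunks[j], offset);
        offset += chunks[j].length;
    }
    return result;
}
```

Called as `mergeChunks(recordedData)`, this would return one array holding every recorded sample in order.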
Now that we have the complete set of samples in one array, we can write the .wav file, starting with the header and followed by the audio data bytes. To do this, I referred to the .wav file format specification. Because the data we’ve recorded is floating point in the range of -1.0 to +1.0, we need to scale the values up to 16-bit signed integers. We can do that by multiplying each value of recorded data by 2^15 - 1, which is 32,767 or 0x7FFF:
    for (var i = 0; i < dataLength; i++) {
        view.setInt16(index, data[i] * 0x7FFF, true); // little-endian 16-bit sample
        index += 2;
    }
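The header itself is not shown above. As a rough sketch, assuming the common 44-byte header for 16-bit mono PCM, it could be written as follows. The `encodeWav` and `writeString` names are hypothetical, and the clamping of each sample to the -1.0 to +1.0 range is an addition of this sketch rather than something the original code does:

```javascript
// Hypothetical helper: writes an ASCII tag such as 'RIFF' byte by byte.
function writeString(view, offset, text) {
    for (var i = 0; i < text.length; i++) {
        view.setUint8(offset + i, text.charCodeAt(i));
    }
}

// Hypothetical encoder: 44-byte PCM header followed by 16-bit mono samples.
function encodeWav(samples, sampleRate) {
    var bytesPerSample = 2;
    var dataSize = samples.length * bytesPerSample;
    var view = new DataView(new ArrayBuffer(44 + dataSize));

    writeString(view, 0, 'RIFF');
    view.setUint32(4, 36 + dataSize, true);                // file size minus 8 bytes
    writeString(view, 8, 'WAVE');
    writeString(view, 12, 'fmt ');
    view.setUint32(16, 16, true);                          // size of the fmt chunk
    view.setUint16(20, 1, true);                           // format 1 = PCM
    view.setUint16(22, 1, true);                           // one channel
    view.setUint32(24, sampleRate, true);
    view.setUint32(28, sampleRate * bytesPerSample, true); // bytes per second
    view.setUint16(32, bytesPerSample, true);              // block align
    view.setUint16(34, 16, true);                          // bits per sample
    writeString(view, 36, 'data');
    view.setUint32(40, dataSize, true);

    var index = 44;
    for (var i = 0; i < samples.length; i++) {
        // Assumption of this sketch: clamp so out-of-range samples cannot wrap.
        var s = Math.max(-1, Math.min(1, samples[i]));
        view.setInt16(index, s * 0x7FFF, true);
        index += 2;
    }
    return view;
}
```

The sample rate would normally come from the audio context, for example `encodeWav(result, audioCtx.sampleRate)`.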
Now that we have all the data in a DataView, we can convert the view into a Blob:
blob = new Blob([view], { type: 'audio/wav' });
And finally, we can play this blob in an <audio> tag:
document.getElementById('player').src = URL.createObjectURL(blob);
or pass it to the excellent wavesurfer plugin to visualize it or play it:
    wavesurfer = WaveSurfer.create({
        container: '#waveform',
        waveColor: '#007BFF',
        progressColor: '#03A8F3'
    });
    wavesurfer.loadBlob(blob);
Step 4: Add an Oscilloscope (optional)
It’s always nice to give the user a bit of feedback, and we can do that by converting the audio data into a waveform that they can see while they’re speaking.
This visualization is simply a line drawn over a series of points. We can get those points from our data and scale them to fit the width and height of a Canvas element. The audio context class allows us to connect an analyser and extract byte time data from it:
    var analyser = audioCtx.createAnalyser();
    analyser.fftSize = 2048;
    var bufferLength = analyser.frequencyBinCount;
    var dataArray = new Uint8Array(bufferLength);
    source.connect(analyser);
    analyser.getByteTimeDomainData(dataArray);
The byte time data is an array of unsigned byte values ranging from 0 to 255, with 128 representing silence. By scaling these values to the height of the canvas, we can stroke a path to create a waveform.
    var sliceWidth = canvas.width / bufferLength; // horizontal distance between samples
    var x = 0;
    canvasCtx.beginPath();
    for (var i = 0; i < bufferLength; i++) {
        var v = dataArray[i] / 128.0; // 1.0 corresponds to silence
        var y = v * height / 2;
        i == 0 ? canvasCtx.moveTo(x, y) : canvasCtx.lineTo(x, y);
        x += sliceWidth;
    }
    canvasCtx.lineTo(canvas.width, canvas.height / 2);
    canvasCtx.stroke();
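To make the scaling easier to check on its own, the same arithmetic can be pulled out of the drawing code. This is only a sketch; `waveformPoints` is a hypothetical name, and it returns plain coordinates rather than drawing anything:

```javascript
// Hypothetical helper: maps byte time-domain values (0 to 255, 128 = silence)
// to x/y coordinates for a canvas of the given width and height.
function waveformPoints(dataArray, width, height) {
    var sliceWidth = width / dataArray.length;
    var points = [];
    for (var i = 0; i < dataArray.length; i++) {
        var v = dataArray[i] / 128.0; // 1.0 at silence
        points.push({ x: i * sliceWidth, y: v * height / 2 });
    }
    return points;
}
```

A silent sample (128) lands on the vertical middle of the canvas, a value of 0 lands at the top edge, and 255 lands just short of the bottom edge.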
It’s recording the audio but not showing waveforms
I’m finding that this doesn’t work on some iOS devices. `navigator.mediaDevices` is undefined. They are all on iOS 11.4. Only thing I can think of is the devices that worked have been hooked up to Xcode at some point.
i am also in this problem. any updates???
This happens when you add the WebApp to the homescreen. The WebView doesn’t support getUserMedia.
https://bugs.webkit.org/show_bug.cgi?id=185448
to get the canvas showing, first add a canvas to the “container” div,
An alternative text describing what your canvas displays.
then add
    visualizer: {
        element: document.getElementById('canvas')
    }
to the object passed into WzRecorder.
my HTML didn’t show up in that last reply.
use HTML from https://developer.mozilla.org/en-US/docs/Web/HTML/Element/canvas
figured out problem with iOS. If the device hasn’t been used in development with Xcode, then you have to turn on the “Camera & Microphone” option in Settings app under Safari.
Learning javascripts… how do i call the upload function
Great process explained here for recording audio from mobile websites directly. Great for iOS and Android based device users.
LOOKING FORWARD TO IT
I now feel bad i bothered replying and waiting all these days to see you did not approve my request. oh well…
Hi Jordan,
I’m not showing any requests from this email account. Please contact our support team through http://gmass.co/g/support
Hi Marvin, it was a very long post thanking you and sharing insights. I also wanted to learn how to encode to mp3 within inline worker. I have since solved the problem. one question tho? how do you prevent sound from the app from being recorded. eg when i tap a record button it plays a ‘start record sound’, but that gets captured by mic.
Hi, I tried this app. However, when I build the apk file and run in my phone, I get an error “Unable to Access Microphone”.
Any idea what I am missing ?
Thanks,
Sun
Wow! The first working demo for iOS! Wonderful work!!
Best regards
Marcophono
I’m trying to use this in a React application. I get an error saying “sampleRate is not defined”… any suggestions?
This error did not occur when it was in HTML…
Doesnt work on Safari nor iOS 🙁
As of iOS 13, just go to the url and use: Add to Homepage
Works on IOS but has audio pops or short gaps that occur several times a second in the wav playback of the recorded sound (most evident if you whistle a constant frequency into the microphone while recording). Any thoughts?
I see this too. No solution yet.
Doesn’t work on Firefox 73.0.1 (64-bit) on Mac OS 10.14.6 (Mojave)
Keep it up its great!!!!!
The .js file contains an upload function (‘this.upload = function (url, params, callback)’) but is not explained. How can it be triggered?
Great example. Thanks