Here we demonstrate how to use TensorFlow.js with text data right in the browser, and how transformer models like USE can power Natural Language Processing tasks and chatbots.
TensorFlow + JavaScript. The most popular, cutting-edge AI framework now supports the most widely used programming language on the planet. So let’s make text and NLP (Natural Language Processing) chatbot magic happen through Deep Learning right in our web browser, GPU-accelerated via WebGL using TensorFlow.js!
You are welcome to download the project code.
Ay! ’Tis a Shakespeare. In this article – the last in the series – we’ll generate a Shakespearean monologue using AI.
Setting Up TensorFlow.js Code
This project runs within a single web page. We will include TensorFlow.js and Universal Sentence Encoder (USE), which is a pre-trained transformer-based language processing model. We’ll print the bot output to the page. Two additional utility functions from the USE readme example, dotProduct and zipWith, will help us determine sentence similarity.
<html>
    <head>
        <title>Shakespearean Monologue Bot: Chatbots in the Browser with TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.0.0/dist/tf.min.js"></script>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/universal-sentence-encoder"></script>
    </head>
    <body>
        <h1 id="status">Shakespearean Monologue Bot</h1>
        <pre id="bot-text"></pre>
        <script>
        // Show a status message in the page heading
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        // Dot product of two equal-length vectors (undefined if the lengths differ)
        const dotProduct = (xs, ys) => {
            const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;

            return xs.length === ys.length ?
                sum(zipWith((a, b) => a * b, xs, ys))
                : undefined;
        };

        // Combine two arrays element by element with the function f
        const zipWith =
            (f, xs, ys) => {
                const ny = ys.length;
                return (xs.length <= ny ? xs : xs.slice(0, ny))
                    .map((x, i) => f(x, ys[i]));
            };

        (async () => {
            // The bot code from the following sections goes here
        })();
        </script>
    </body>
</html>
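To get a feel for how these two helpers score similarity, here is a minimal sketch that runs them on made-up three-dimensional vectors. The vectors and the names query, closeMatch, and poorMatch are assumptions for illustration only; real USE embeddings have many more dimensions. The helper definitions are repeated so the snippet runs on its own.

```javascript
// Same helpers as in the page template, repeated so this snippet is self-contained.
const zipWith = (f, xs, ys) => {
    const ny = ys.length;
    return (xs.length <= ny ? xs : xs.slice(0, ny))
        .map((x, i) => f(x, ys[i]));
};

const dotProduct = (xs, ys) => {
    const sum = vs => vs ? vs.reduce((a, b) => a + b, 0) : undefined;
    return xs.length === ys.length ?
        sum(zipWith((a, b) => a * b, xs, ys))
        : undefined;
};

// Toy vectors standing in for sentence embeddings (assumed values).
const query = [ 1, 0, 1 ];
const closeMatch = [ 1, 0, 1 ];
const poorMatch = [ 0, 1, 0 ];

console.log( dotProduct( query, closeMatch ) ); // 2
console.log( dotProduct( query, poorMatch ) );  // 0
console.log( dotProduct( query, [ 1, 2 ] ) );   // undefined (lengths differ)
```

A larger dot product means the two vectors point in a more similar direction, which is the property the bot relies on when it compares lines.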
TinyShakespeare Dataset
For this project, our bot will generate its own Shakespeare script using quotes from the TinyShakespeare dataset. It contains 40,000 lines of text from various Shakespeare plays. We’ll use it to create a collection of phrases and their "next-phrases."
Let’s go through every line to fill a message array and a matching response array. The code should look like this:
// Download the dataset and drop empty lines
let shakespeare_lines = await fetch( "web/tinyshakespeare.txt" ).then( r => r.text() );
let lines = shakespeare_lines.split( "\n" ).filter( x => !!x );

// Pair every line (message) with the line that follows it (response)
let messages = [];
let responses = [];
for( let i = 0; i < lines.length - 1; i++ ) {
    messages.push( lines[ i ] );
    responses.push( lines[ i + 1 ] );
}
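As a quick illustration of what the two arrays end up holding, the sketch below applies the same pairing logic to a short sample string instead of the downloaded file. The function name buildPairs and the sample text are assumptions made for this example.

```javascript
// Hypothetical helper: pair each non-empty line with the line that follows it.
function buildPairs( rawText ) {
    const lines = rawText.split( "\n" ).filter( x => !!x );
    const messages = [];
    const responses = [];
    for( let i = 0; i < lines.length - 1; i++ ) {
        messages.push( lines[ i ] );
        responses.push( lines[ i + 1 ] );
    }
    return { messages, responses };
}

// Short sample text (assumed for illustration), including one blank line.
const sample = "First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.";
const pairs = buildPairs( sample );

console.log( pairs.messages );
console.log( pairs.responses );
```

Four non-empty lines produce three message/response pairs, and responses[ i ] is always the line that came right after messages[ i ], so both arrays can share one index.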
Universal Sentence Encoder
The Universal Sentence Encoder (USE) is "a [pre-trained] model that encodes text into 512-dimensional embeddings." For a complete description of the USE and its architecture, please see the Improved Emotion Detection article earlier in this series.
The USE is easy and straightforward to work with. Let’s load its QnA dual encoder, which gives us full-sentence embeddings for all queries and all answers; these should perform better than word embeddings. We can use them to find the message most similar to the current line and then look up its response.
setText( "Loading USE..." );
// Only the QnA dual encoder is needed for this project
const model = await use.loadQnA();
setText( "Loaded!" );
Shakespeare Monologue in Action
Because the sentence embeddings already encode similarity in their vectors, we don’t need to train a separate model. Starting with the hard-coded line "ROMEO:", every 3 seconds we’ll choose a random window of up to 200 consecutive lines and let USE do the hard work. Using the QnA encoder, it will figure out which of those lines is the most similar to the last printed line, and then we look up the matching response.
// The line the bot printed last; the monologue starts from this seed
let text = "ROMEO:";
setInterval( async () => {
    const numSamples = 200;
    let randomOffset = Math.floor( Math.random() * messages.length );
    const input = {
        queries: [ text ],
        // slice() takes ( start, end ), so the end index is randomOffset + numSamples
        responses: messages.slice( randomOffset, randomOffset + numSamples )
    };
    let embeddings = await model.embed( input );
    tf.tidy( () => {
        const embed_query = embeddings[ "queryEmbedding" ].arraySync();
        const embed_responses = embeddings[ "responseEmbedding" ].arraySync();
        // Score every candidate line against the last printed line
        let scores = [];
        embed_responses.forEach( response => {
            scores.push( dotProduct( embed_query[ 0 ], response ) );
        });
        // Print the line that follows the best match
        let id = scores.indexOf( Math.max( ...scores ) );
        text = responses[ randomOffset + id ];
        document.getElementById( "bot-text" ).innerText += text + "\n";
    });
    embeddings.queryEmbedding.dispose();
    embeddings.responseEmbedding.dispose();
}, 3000 );
Now, when you open the page, it will begin to write lines of Shakespeare every 3 seconds.
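The selection step can be tried without loading any model. The following is a minimal sketch, using made-up two-dimensional vectors in place of real embeddings, of how a window of candidate lines is taken and how the best-scoring candidate maps back to a response. The helper names sampleWindow and bestMatchIndex are hypothetical and are not part of TensorFlow.js or USE.

```javascript
// Hypothetical helper: take up to `count` consecutive items starting at `offset`.
// Array.prototype.slice takes ( start, end ), so the end index is offset + count.
function sampleWindow( items, offset, count ) {
    return items.slice( offset, offset + count );
}

// Hypothetical helper: index of the candidate vector with the highest dot product.
function bestMatchIndex( queryVec, candidateVecs ) {
    let best = -1;
    let bestScore = -Infinity;
    candidateVecs.forEach( ( vec, i ) => {
        const score = vec.reduce( ( acc, v, j ) => acc + v * queryVec[ j ], 0 );
        if( score > bestScore ) {
            bestScore = score;
            best = i;
        }
    });
    return best;
}

// Toy data (assumed): responses[ i ] is the line that follows messages[ i ].
const messages = [ "a", "b", "c", "d", "e" ];
const responses = [ "b", "c", "d", "e", "f" ];

const offset = 2;
const windowed = sampleWindow( messages, offset, 2 );  // [ "c", "d" ]
const toyEmbeddings = [ [ 0.1, 0.9 ], [ 0.8, 0.2 ] ];  // one vector per windowed line
const id = bestMatchIndex( [ 1, 0 ], toyEmbeddings );  // 1

console.log( windowed, responses[ offset + id ] );     // [ 'c', 'd' ] 'e'
```

Because the window starts at offset, the winning index has to be shifted by the same offset before it is used to look up the response, which is why the bot reads responses[ randomOffset + id ].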
Finish Line
Here is the code that puts it all together:
<html>
    <head>
        <title>Shakespearean Monologue Bot: Chatbots in the Browser with TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.0.0/dist/tf.min.js"></script>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/universal-sentence-encoder"></script>
    </head>
    <body>
        <h1 id="status">Shakespearean Monologue Bot</h1>
        <pre id="bot-text"></pre>
        <script>
        // Show a status message in the page heading
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        // Dot product of two equal-length vectors (undefined if the lengths differ)
        const dotProduct = (xs, ys) => {
            const sum = xs => xs ? xs.reduce((a, b) => a + b, 0) : undefined;

            return xs.length === ys.length ?
                sum(zipWith((a, b) => a * b, xs, ys))
                : undefined;
        };

        // Combine two arrays element by element with the function f
        const zipWith =
            (f, xs, ys) => {
                const ny = ys.length;
                return (xs.length <= ny ? xs : xs.slice(0, ny))
                    .map((x, i) => f(x, ys[i]));
            };

        (async () => {
            // Download the dataset and drop empty lines
            let shakespeare_lines = await fetch( "web/tinyshakespeare.txt" ).then( r => r.text() );
            let lines = shakespeare_lines.split( "\n" ).filter( x => !!x );

            // Pair every line (message) with the line that follows it (response)
            let messages = [];
            let responses = [];
            for( let i = 0; i < lines.length - 1; i++ ) {
                messages.push( lines[ i ] );
                responses.push( lines[ i + 1 ] );
            }

            // Only the QnA dual encoder is needed for this project
            setText( "Loading USE..." );
            const model = await use.loadQnA();
            setText( "Loaded!" );

            // The line the bot printed last; the monologue starts from this seed
            let text = "ROMEO:";
            setInterval( async () => {
                const numSamples = 200;
                let randomOffset = Math.floor( Math.random() * messages.length );
                const input = {
                    queries: [ text ],
                    // slice() takes ( start, end ), so the end index is randomOffset + numSamples
                    responses: messages.slice( randomOffset, randomOffset + numSamples )
                };
                let embeddings = await model.embed( input );
                tf.tidy( () => {
                    const embed_query = embeddings[ "queryEmbedding" ].arraySync();
                    const embed_responses = embeddings[ "responseEmbedding" ].arraySync();
                    // Score every candidate line against the last printed line
                    let scores = [];
                    embed_responses.forEach( response => {
                        scores.push( dotProduct( embed_query[ 0 ], response ) );
                    });
                    // Print the line that follows the best match
                    let id = scores.indexOf( Math.max( ...scores ) );
                    text = responses[ randomOffset + id ];
                    document.getElementById( "bot-text" ).innerText += text + "\n";
                });
                embeddings.queryEmbedding.dispose();
                embeddings.responseEmbedding.dispose();
            }, 3000 );
        })();
        </script>
    </body>
</html>
To Sum It Up
This article, along with the others in our series, demonstrated how to use TensorFlow.js with text data right in the browser, and how transformer models like USE can power Natural Language Processing tasks and chatbots.
I hope these examples will inspire you to do even more with AI and Deep Learning. Build away and don’t forget to have fun while doing so!
Raphael Mun is a tech entrepreneur and educator who has been developing software professionally for over 20 years. He currently runs Lemmino, Inc and teaches and entertains through his Instafluff livestreams on Twitch building open source projects with his community.