10. Placeholders and Variables
Here is a question that I’d like you to think about as we go through the contents of this video. Let’s say we have a TensorFlow program which seeks to learn linear regression, i.e., a TensorFlow implementation of linear regression. Are the regression parameters, that is, the slope and the intercept, going to be variables, placeholders, or constants, in terms of entities in the TensorFlow computation graph? Because linear regression is such a common problem and so many folks are interested in solving it, there are many different approaches to it.
One of these is to make use of cookie-cutter implementations. These are available in tools such as Excel, for instance, where you can use the worksheet functions LINEST, RSQ, SLOPE, and INTERCEPT to solve linear regression almost instantaneously. Here we are going to adopt a different approach: the iterative, machine-learning-based approach. We will start with some values, let’s call them a_start and b_start, which simply represent some initial estimates of the values of those constants. We will then go ahead and refine the values of these constants through something known as the training process.
This is exactly the same procedure that we will follow later on with more complex neural networks or other machine learning algorithms. Notice that these initial values of A and B could be badly off. They could result in a line which looks like that on screen. Now we will calculate the least square error, which is the sum of the squares of all these dotted lines, and that will be a very large quantity. This initial estimate of A start and B start yields a very high error. By the way, you should know that we need not even use all of the data points in our training data set for this exercise. We’ll have more to say on the exact details of this gradient descent optimization later.
In any case, we calculate the least square error, we find that it’s very bad, and we adjust the values of a and b. This gives us a slightly better estimate; let’s call these new values a1 and b1. These new values yield a much lower error, and by iteratively adjusting them, we will ultimately converge on the best-fitting solution: the best-fit regression line, which actually minimizes the sum of the squared residuals. This is a fundamentally iterative process, and it is also an example of a supervised learning algorithm. In general, any machine learning algorithm in which we train our model by using points where we already know the correct answer is known as a supervised approach.
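To make this iterative procedure concrete, here is a minimal pure-Python sketch of the idea; the data points, the learning rate, and the iteration count are all illustrative, and plain gradient descent stands in for whatever optimizer TensorFlow will eventually apply for us:

```python
# A minimal sketch of the iterative training idea described above.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.2, 5.9, 8.1]   # noisy points near y = 2x (made-up data)

a, b = 0.0, 0.0             # a_start and b_start: deliberately bad guesses
learning_rate = 0.01

for step in range(1000):
    # Least square error: the sum of the squared residuals for the current line.
    residuals = [(a * x + b) - y for x, y in zip(xs, ys)]
    # Gradient of that error with respect to a and b.
    grad_a = sum(2 * r * x for r, x in zip(residuals, xs))
    grad_b = sum(2 * r for r in residuals)
    # Adjust the estimates a little, then repeat.
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b

print(a, b)  # converges toward the best-fit slope and intercept
```

The gradient step in the middle is exactly the bookkeeping that TensorFlow’s optimizers will automate for us later.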
Here we do know the correct answer for each of these points in our training data: we know the actual values of y given the values of x, and we are using this knowledge to find the constants a and b. Unsupervised learning, by contrast, is applied to cases like clustering or auto-detecting association rules. Typically, unsupervised learning is used only when you do not have a large number of training data points for which you already know the correct answer. Almost everything that we just discussed was fairly generally applicable to all machine learning problems.
Let’s now turn to a couple of important concepts which are more or less specific to TensorFlow. Let’s start by understanding what placeholders are. In a nutshell, placeholders are a mechanism in TensorFlow to specify inputs into a graph. These are values that you will specify while training that machine learning model. In the case of linear regression, for instance, maybe you have one model which is going to predict the price of Google stock from the Dow Jones equity index. Maybe you have another linear regression model which is very similar, but which uses a different pair of variables, for instance, life expectancy versus income.
Now, there is no real fundamental difference between the structure of these models. All that varies are the x and y values, and this is exactly where placeholders come in handy. A machine learning model needs to have the ability to accept different x and y values, and in TensorFlow this is achieved by using placeholders. And ultimately, what are these different values in terms of the computation graph? They are simply our input nodes. So if we go back to the simple example of a TensorFlow computation graph, the input nodes A and B are the placeholders, because we will vary the values of A and B each time we execute this graph.
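As a minimal sketch of that simple graph (the names A and B follow the example above; the feed dictionary mechanism used here is covered in detail in the lab that follows):

```python
import tensorflow as tf  # TensorFlow 1.x style, as used throughout this course

# A and B are input nodes: their values are supplied each time the graph runs.
A = tf.placeholder(tf.float32, name='A')
B = tf.placeholder(tf.float32, name='B')
C = tf.add(A, B)

with tf.Session() as sess:
    # The same graph can be executed with different inputs on each run.
    print(sess.run(C, feed_dict={A: 3.0, B: 4.0}))    # 7.0
    print(sess.run(C, feed_dict={A: 10.0, B: 20.0}))  # 30.0
```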
So placeholders literally hold the place of tensors that are going to be fed into the computation graph at runtime, and in this way they act as input nodes in the computation graph. This will become a lot clearer with the accompanying demos, in which we will perform some simple math operations and specify placeholders; that will make it pretty concrete. If placeholders are the input nodes in the computation graph, then variables are the parameters of the machine learning model. In the case of our regression line, we had two parameters, a and b. Recall that we had started with the initial values a_start and b_start, and that these values were tweaked during the training process; at the end of this process we had the values corresponding to the best-fitting regression line.
Clearly, these values a and b are parameters of this machine learning model, and more importantly, these parameters are going to be altered during the training process. This means that the values of a and b will change; they will vary, and quantities like these are called variables in TensorFlow. Variables and placeholders have fundamentally different functions, so please be sure to understand the difference. Variables are a TensorFlow construct which is required for the training process: because the algorithm is iteratively going to approach the solution, the machine learning model must have the ability to hold constantly changing values of its model parameters.
An example of these model parameters are the constants a and b in a regression equation, so that’s the use case for variables in TensorFlow. Placeholders, on the other hand, have a completely different use case. We need placeholders so that the same machine learning algorithm can be applied to a variety of problems. Placeholders allow us to have just one model, which then has the ability to accept different x and y values. This is the use case for placeholders, which, again, are the input nodes in the computation graph. Placeholders and variables, along with constants, are the three types of tensors, or the three types of data items,
in TensorFlow. Constants are immutable values which do not change. Placeholders are input values which are assigned once per run of a model and will not be changed after that; these are the inputs into the computation graph. And variables are indeed variable: they are constantly recomputed at each iteration during the training of a TensorFlow model. But just to be clear, once we are done with the training process and the variables have assumed their final converged values, those same values of the variables will be used, without changing them, for different values of the placeholders.
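Here is a minimal sketch of that persistence, using a hypothetical counter variable rather than a real model parameter; the point is simply that a variable’s value survives from one session.run call to the next:

```python
import tensorflow as tf

# A mutable value, standing in for a model parameter.
counter = tf.Variable(0, name='counter')
increment = tf.assign(counter, counter + 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # variables must be initialized
    # The variable's value persists across successive calls to session.run.
    print(sess.run(increment))  # 1
    print(sess.run(increment))  # 2
    print(sess.run(increment))  # 3
```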
In the context of TensorFlow sessions, variables are mutable values, but they will hold their values across multiple calls to session.run. Again, this is important: variables are mutable, but those values will persist across multiple calls to session.run. Let’s return to the question we posed at the start of this video. In a TensorFlow program that implements linear regression using machine learning, the regression parameters are clearly going to have their values updated during the training process. The program will converge on the best values of those regression parameters by minimizing some cost function, i.e., the mean square error. The type of construct used for such entities is the variable. Variables, as their name would suggest, are going to change their values during the course of the program. These regression parameters cannot be placeholders; remember that placeholders are the input nodes of the computation graph. In the case of linear regression, the placeholders will hold the values of x and y. And constants, of course, are for constant values, such as pi, as their name would suggest.
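Pulling the whole answer together, here is a minimal sketch of linear regression in TensorFlow 1.x; the data, learning rate, and iteration count are illustrative, not taken from the course:

```python
import tensorflow as tf

# Inputs: placeholders, fed with different data on each run.
x = tf.placeholder(tf.float32, name='x')
y = tf.placeholder(tf.float32, name='y')

# Regression parameters: variables, updated by the training process.
a = tf.Variable(0.0, name='a')
b = tf.Variable(0.0, name='b')

y_pred = a * x + b
loss = tf.reduce_mean(tf.square(y_pred - y))  # mean square error
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(2000):
        sess.run(train_step, feed_dict={x: [1, 2, 3, 4], y: [2, 4, 6, 8]})
    print(sess.run([a, b]))  # converges toward a = 2.0, b = 0.0
```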
11. Lab: Placeholders
At the end of this lecture, you should know what a feed dictionary is and how it is used in TensorFlow. In this lecture, we learn one more new concept in TensorFlow: that of placeholders. In any program, whether it’s using TensorFlow or any other programming language, you need a way to input data into the program. You need a way to set up a program where a bunch of operations are performed, and then specify different inputs and have those operations be performed on each of them. These inputs are specified using placeholders in TensorFlow. The code for this practical demonstration will be in the file simple math with placeholders.
At the very top, you import TensorFlow and the Google Datalab ML library. x and y here are placeholders, which you instantiate using tf.placeholder. Placeholders typically do not have any values when they are defined; you can specify some if you want to, but the most important thing is that they serve as input nodes. When you have a bunch of data that you want to feed into this program, you’ll feed it into the placeholders. We’ve set up a bunch of characteristics for these placeholders: they hold integer elements, their shape is 3, which means each is a vector of three elements, and their names are x and y.
Placeholders are essentially just tensors which do not take on their values till we actually execute the computation graph. Let’s perform a bunch of operations on these placeholders, the same operations we did before: reduce_sum on x and reduce_prod on y. The final div computation divides sum_x by prod_y. sum_x cannot be computed before we provide the input for placeholder x, and prod_y cannot be computed till we have the input for placeholder y. We instantiate the session using the with statement, which ensures that we close the session automatically once we are done with it.
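The setup described so far might look like the following sketch; the exact code in the notebook may differ slightly:

```python
import tensorflow as tf

# Two input nodes: 3-element integer vectors, fed at runtime.
x = tf.placeholder(tf.int32, shape=[3], name='x')
y = tf.placeholder(tf.int32, shape=[3], name='y')

sum_x = tf.reduce_sum(x, name='sum_x')      # cannot run until x is fed
prod_y = tf.reduce_prod(y, name='prod_y')   # cannot run until y is fed
final_div = tf.div(sum_x, prod_y, name='final_div')
```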
Let’s take a look at the very first statement that we execute. We call session.run on sum_x. sum_x cannot be computed till we pass in a value for the x placeholder; right now, x has no values assigned to it, and the computation can’t run on nothing. When executing the operation of a computation node, you specify the values for the placeholders that go into that node’s computation using something known as a feed dictionary. The feed dictionary is simply a JSON-like structure, a Python dictionary, that you pass in as an argument to session.run, which specifies the values for the placeholders used in that computation.
Notice that when you call session.run on sum_x, the feed dictionary specifies the value for x, which is a 1-D tensor with 100, 200, and 300 as its primitive values. In the next statement, session.run computes prod_y; here the feed dictionary specifies the value for y. Again, it’s a vector, a one-dimensional tensor of rank one, with the values 11, 22, and 33. Now come to the very last print statement within the tf.Session block. This prints the division of sum_x by prod_y. sum_x requires the placeholder values for x, and prod_y requires the placeholder values for y.
Both have to be specified within our feed dictionary. For every node computation which involves placeholders, you need to specify values for them using a feed dictionary passed in as an argument to session.run. Executing these statements will give you the results of all of these computations. The sum of x is 600: the feed dictionary passed in 100, 200, and 300, and if you sum them up, you get 600. prod_y multiplies the values that were passed into the y placeholder, that is, 11 multiplied by 22 multiplied by 33. And finally you get the final division: sum_x divided by prod_y uses the values from the feed dictionary that was passed into that computation.
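Continuing the sketch above, those three calls and their results would look roughly like this:

```python
with tf.Session() as sess:
    # Each run supplies values for exactly the placeholders its node needs.
    print(sess.run(sum_x, feed_dict={x: [100, 200, 300]}))  # 600
    print(sess.run(prod_y, feed_dict={y: [11, 22, 33]}))    # 7986
    # The division needs both placeholders in one feed dictionary.
    print(sess.run(final_div,
                   feed_dict={x: [100, 200, 300], y: [1, 2, 3]}))  # 100
```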
The feed dictionary here had the x values 100, 200, and 300, so the sum of those is 600, and the y values were 1, 2, and 3. If you multiply 1 by 2 by 3, you get 6, and 600 divided by 6 is equal to 100. The code that I have here on screen is a little bit buggy. I have fixed that in the source code, so the code that you run should not be buggy; it should be just fine. The statements which initialize and close the writer, which writes out the summary to TensorBoard, needed to be indented so that they are part of the with statement instantiating the session. Writing out the summary requires access to the session, so we can pass in the session graph.
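A sketch of the corrected indentation, continuing the example above (the log directory name here is illustrative, not the notebook’s):

```python
with tf.Session() as sess:
    # The writer lives inside the session's with-block so that it can be
    # handed the session graph; this was the indentation fix described above.
    writer = tf.summary.FileWriter('./simple_math_with_placeholders', sess.graph)
    print(sess.run(sum_x, feed_dict={x: [100, 200, 300]}))
    writer.close()
```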
Run TensorBoard and click on the helpful link that’s provided to you within this notebook, and you’ll see the session graph displayed on screen. This is, of course, under the Graphs tab. Let’s see how the placeholders that we used are represented. If you click on x and view the pane on the top right, you’ll notice that x is a placeholder; it also shows you the type, shape, and dimensions of the placeholder, as we had set them up in our program. As usual, whenever you visualize stuff on TensorBoard, it helps to click around and see how data flows within your program.
The important thing, which I forgot to tell you earlier, is that once you’re done with TensorBoard, you should switch back to your main program and kill the TensorBoard process. It’s just good practice to clean up after yourself in every program. In this particular program that you see on screen, I have forgotten to close the session; that’s a miss on my end. So what’s a feed dictionary and why is it used? Any TensorFlow program should be ready to accept input data at runtime. When nodes are computed, the placeholders that hold this information are fed into session.run using a feed dictionary. A feed dictionary in TensorFlow is essentially a way to specify, when you run a computation graph, what input values the graph should work on.