Selected from Matrices.io
Author: Florian Courtial
Translated by Machine Heart
Source: Machine Heart
Contributors: Li Zenan, Jiang Siyuan
The popular deep learning framework TensorFlow (whose official Chinese WeChat account launched earlier this month) is built on a C++ core, yet the vast majority of people develop their TensorFlow models in Python. As the C++ API improves, building neural networks directly in C++ has become possible. This article introduces a simple way to do it.
Many people know that the core of TensorFlow is written in C++, but most of this deep learning framework's functionality is only convenient to use from the Python API.
When I wrote the previous article, my goal was to implement a basic deep neural network (DNN) using only TensorFlow's C++ API and CuDNN. Partway through, I realized I had overlooked quite a few things.
Note that training a network built from exotic operations may simply be impossible: the error you are most likely to hit is a missing gradient implementation for some operation. I am currently working on moving gradient computations from Python to C++.
In this article, I will show how to build a deep neural network with TensorFlow in C++ and use it to estimate the price of a BMW 1 Series car from features such as its age, mileage, and fuel type. There is no usable C++ optimizer yet, so you will see that the training code looks less appealing than it would in Python; an optimizer will be added later.
- This article follows the guide for the TensorFlow 1.4 C++ API: https://www.tensorflow.org/api_guides/cc/guide
- Code on GitHub: https://github.com/theflofly/dnn_tensorflow_cpp
Installation
We will run TensorFlow as a C++ framework. You might be tempted to use a precompiled library, but quirks of individual environments make that error-prone. Building TensorFlow from source avoids these problems and guarantees that we use the latest version of the API.
First, you need to install the bazel build tool. Here is the installation method: https://docs.bazel.build/versions/master/install.html
On OSX, brew is sufficient:
brew install bazel
You need to start building from the TensorFlow source files:
mkdir /path/tensorflow
cd /path/tensorflow
git clone https://github.com/tensorflow/tensorflow.git
Next you need to configure the build, for example whether to use GPU support. Run the configuration script like this:
cd /path/tensorflow
./configure
Now we create the files that will hold our TensorFlow model code. Note that the first build will take quite a while (10-15 minutes). The non-core C++ TF code lives in /tensorflow/cc; this is where we will create our model file. We also need a BUILD file so that bazel can build the model.
mkdir /path/tensorflow/model
cd /path/tensorflow/model
touch model.cc
touch BUILD
We add bazel instructions to the BUILD file:
load("//tensorflow:tensorflow.bzl", "tf_cc_binary")
tf_cc_binary(
name = "model",
srcs = [
"model.cc",
],
deps = [
"//tensorflow/cc:gradients",
"//tensorflow/cc:grad_ops",
"//tensorflow/cc:cc_ops",
"//tensorflow/cc:client_session",
"//tensorflow/core:tensorflow"
],
)
Basically, this tells bazel to build a binary from model.cc against the TensorFlow C++ targets. Now we can start writing our model.
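Assuming the layout above (model.cc and BUILD under /path/tensorflow/model), the target could be built from the repository root roughly as follows; the exact label depends on where you placed the directory (the run command quoted at the end of this article uses //tensorflow/cc/models:model instead):
cd /path/tensorflow
bazel build -c opt //tensorflow/model:model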
Reading Data
This data was extracted from the French site leboncoin.fr, then cleaned and normalized, and saved to a CSV file. Our goal is to read it. The metadata used for normalization is stored in the first line of the CSV file; we need it to reconstruct a price from the network's output. To keep the code clean, we create data_set.h and data_set.cc files; they turn the CSV file into flat arrays of floats that we will feed to the neural network.
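As a sketch of the layout this code expects (the numbers below are invented for illustration; the real file is linked in the next section), the first line holds the six normalization values, and each data line holds km, fuel, age, and price:
mean_km,std_km,mean_age,std_age,min_price,max_price   <- line 0: normalization metadata
km,fuel,age,price                                     <- line 1: presumably a header, skipped by the reader
-0.61,1.0,0.35,0.23                                   <- lines 2 and on: one normalized car per line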
data_set.h
using namespace std;
// Meta data used to normalize the data set. Useful to
// go back and forth between normalized data.
class DataSetMetaData {
friend class DataSet;
private:
float mean_km;
float std_km;
float mean_age;
float std_age;
float min_price;
float max_price;
};
enum class Fuel {
DIESEL,
GAZOLINE
};
class DataSet {
public:
// Construct a data set from the given csv file path.
DataSet(string path) {
ReadCSVFile(path);
}
// getters
vector<float>& x() { return x_; }
vector<float>& y() { return y_; }
// read the given csv file and complete x_ and y_
void ReadCSVFile(string path);
// convert one csv line to a vector of float
vector<float> ReadCSVLine(string line);
// normalize a human input using the data set metadata
initializer_list<float> input(float km, Fuel fuel, float age);
// convert a price outputted by the DNN to a human price
float output(float price);
private:
DataSetMetaData data_set_metadata;
vector<float> x_;
vector<float> y_;
};
data_set.cc
#include <vector>
#include <fstream>
#include <sstream>
#include <iostream>
#include "data_set.h"
using namespace std;
void DataSet::ReadCSVFile(string path) {
ifstream file(path);
stringstream buffer;
buffer << file.rdbuf();
string line;
vector<string> lines;
while(getline(buffer, line, '\n')) {
lines.push_back(line);
}
// the first line contains the metadata
vector<float> metadata = ReadCSVLine(lines[0]);
data_set_metadata.mean_km = metadata[0];
data_set_metadata.std_km = metadata[1];
data_set_metadata.mean_age = metadata[2];
data_set_metadata.std_age = metadata[3];
data_set_metadata.min_price = metadata[4];
data_set_metadata.max_price = metadata[5];
// the remaining lines contain the features for each car
// (the loop starts at index 2: lines[1] is presumably a header row)
for (int i = 2; i < lines.size(); ++i) {
vector<float> features = ReadCSVLine(lines[i]);
x_.insert(x_.end(), features.begin(), features.begin() + 3);
y_.push_back(features[3]);
}
}
vector<float> DataSet::ReadCSVLine(string line) {
vector<float> line_data;
std::stringstream lineStream(line);
std::string cell;
while(std::getline(lineStream, cell, ','))
{
line_data.push_back(stod(cell));
}
return line_data;
}
initializer_list<float> DataSet::input(float km, Fuel fuel, float age) {
km = (km - data_set_metadata.mean_km) / data_set_metadata.std_km;
age = (age - data_set_metadata.mean_age) / data_set_metadata.std_age;
float f = fuel == Fuel::DIESEL ? -1.f : 1.f;
return {km, f, age};
}
float DataSet::output(float price) {
return price * (data_set_metadata.max_price - data_set_metadata.min_price) + data_set_metadata.min_price;
}
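Before wiring these files into the graph, here is a minimal usage sketch (not part of the original code; the CSV path is a placeholder):
#include <initializer_list>
#include <iostream>
#include <string>
#include <vector>
#include "data_set.h"

int main() {
  DataSet data_set("/path/normalized_car_features.csv");
  // x_ is flat: three features per car
  std::cout << data_set.x().size() / 3 << " cars, " << data_set.y().size() << " prices" << std::endl;
  // normalize a human input the same way the training data was normalized;
  // in the model code this is fed directly to the x placeholder
  auto features = data_set.input(110000.f, Fuel::DIESEL, 7.f);
  std::cout << features.size() << " normalized features" << std::endl;
  // map a network output in [0, 1] back to euros (0.5f is an arbitrary value)
  std::cout << data_set.output(0.5f) << " euros" << std::endl;
  return 0;
}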
We must add these two files to the bazel BUILD file.
load("//tensorflow:tensorflow.bzl", "tf_cc_binary")
tf_cc_binary(
name = "model",
srcs = [
"model.cc",
"data_set.h",
"data_set.cc"
],
deps = [
"//tensorflow/cc:gradients",
"//tensorflow/cc:grad_ops",
"//tensorflow/cc:cc_ops",
"//tensorflow/cc:client_session",
"//tensorflow/core:tensorflow"
],
)
Building the Model
The first step is to read the CSV file into two tensors, x for the inputs and y for the expected true results. We use the DataSet class defined earlier.
CSV dataset download link: https://github.com/theflofly/dnn_tensorflow_cpp/blob/master/normalized_car_features.csv
DataSet data_set("/path/normalized_car_features.csv");
Tensor x_data(DataTypeToEnum<float>::v(),
TensorShape{static_cast<int>(data_set.x().size())/3, 3});
copy_n(data_set.x().begin(), data_set.x().size(),
x_data.flat<float>().data());
Tensor y_data(DataTypeToEnum<float>::v(),
TensorShape{static_cast<int>(data_set.y().size()), 1});
copy_n(data_set.y().begin(), data_set.y().size(),
y_data.flat<float>().data());
To define a tensor, we need its type and shape. In the data_set object, the x data is stored in a flat vector, so we divide its size by 3 (each car contributes three features) to obtain the number of rows. We then use std::copy_n to copy the data from the data_set object into the Tensor's underlying data structure (an Eigen::TensorMap). Now that we have the data in TensorFlow's data structures, it's time to build the model.
You can easily debug a tensor:
LOG(INFO) << x_data.DebugString();
A particularity of the C++ API is that you need a Scope object to hold the state of the graph under construction, and you must pass that object to every operation.
Scope scope = Scope::NewRootScope();
We need two placeholders: x holding the features and y the corresponding price of each car.
auto x = Placeholder(scope, DT_FLOAT);
auto y = Placeholder(scope, DT_FLOAT);
Our network has two hidden layers, so we will have three weight matrices and three bias vectors. In Python this is handled under the hood; in C++ you must define a variable and then define an Assign node that sets a default value for that variable. We initialize our variables with RandomNormal, which gives us random values drawn from a normal distribution.
// weights init
auto w1 = Variable(scope, {3, 3}, DT_FLOAT);
auto assign_w1 = Assign(scope, w1, RandomNormal(scope, {3, 3}, DT_FLOAT));
auto w2 = Variable(scope, {3, 2}, DT_FLOAT);
auto assign_w2 = Assign(scope, w2, RandomNormal(scope, {3, 2}, DT_FLOAT));
auto w3 = Variable(scope, {2, 1}, DT_FLOAT);
auto assign_w3 = Assign(scope, w3, RandomNormal(scope, {2, 1}, DT_FLOAT));
// bias init
auto b1 = Variable(scope, {1, 3}, DT_FLOAT);
auto assign_b1 = Assign(scope, b1, RandomNormal(scope, {1, 3}, DT_FLOAT));
auto b2 = Variable(scope, {1, 2}, DT_FLOAT);
auto assign_b2 = Assign(scope, b2, RandomNormal(scope, {1, 2}, DT_FLOAT));
auto b3 = Variable(scope, {1, 1}, DT_FLOAT);
auto assign_b3 = Assign(scope, b3, RandomNormal(scope, {1, 1}, DT_FLOAT));
Then we build the three layers with Tanh as the activation function; the dimensions go from the 3 input features to 3 hidden units, then 2 hidden units, then the single output (the price).
// layers
auto layer_1 = Tanh(scope, Add(scope, MatMul(scope, x, w1), b1));
auto layer_2 = Tanh(scope, Add(scope, MatMul(scope, layer_1, w2), b2));
auto layer_3 = Tanh(scope, Add(scope, MatMul(scope, layer_2, w3), b3));
Add L2 regularization.
// regularization
auto regularization = AddN(scope,
initializer_list<Output>{L2Loss(scope, w1),
L2Loss(scope, w2),
L2Loss(scope, w3)});
Finally, we compute the loss: the mean of the squared differences between the predicted prices and the actual prices y, plus the regularization term.
// loss calculation
auto loss = Add(scope,
ReduceMean(scope, Square(scope, Sub(scope, layer_3, y)), {0, 1}),
Mul(scope, Cast(scope, 0.01, DT_FLOAT), regularization));
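Spelled out, the quantity being minimized is mean((layer_3 − y)²) + 0.01 · (‖w1‖² + ‖w2‖² + ‖w3‖²) / 2, since TensorFlow's L2Loss(t) computes sum(t²) / 2.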
This completes the forward pass; now it is time for backpropagation. The first step is a single function call that adds the gradient operations to the forward-pass graph.
// add the gradients operations to the graph
std::vector<Output> grad_outputs;
TF_CHECK_OK(AddSymbolicGradients(scope, {loss}, {w1, w2, w3, b1, b2, b3}, &grad_outputs));
This call adds, for each variable, the operations that compute the derivative of the loss with respect to that variable. grad_outputs starts out as an empty vector and is filled with the gradient nodes, in the same order as the variable list passed to AddSymbolicGradients: grad_outputs[0] is the gradient of the loss with respect to w1, grad_outputs[1] the gradient with respect to w2, and so on through {w1, w2, w3, b1, b2, b3}.
grad_outputs now holds one node per variable; each, when run in a TensorFlow session, computes the gradient of the loss with respect to its variable. We use these nodes to update the variables, one line per variable, with the most basic form of gradient descent.
// update the weights and bias using gradient descent
auto apply_w1 = ApplyGradientDescent(scope, w1, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[0]});
auto apply_w2 = ApplyGradientDescent(scope, w2, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[1]});
auto apply_w3 = ApplyGradientDescent(scope, w3, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[2]});
auto apply_b1 = ApplyGradientDescent(scope, b1, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[3]});
auto apply_b2 = ApplyGradientDescent(scope, b2, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[4]});
auto apply_b3 = ApplyGradientDescent(scope, b3, Cast(scope, 0.01, DT_FLOAT), {grad_outputs[5]});
The Cast operation supplies the learning rate, 0.01 here, as a float tensor.
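Since all six update nodes share the same rate, the Cast node could equally be created once and reused; a minimal equivalent sketch (not the original code):
// build the learning-rate constant once and share it across the update ops
auto learning_rate = Cast(scope, 0.01, DT_FLOAT);
auto apply_w1 = ApplyGradientDescent(scope, w1, learning_rate, {grad_outputs[0]});
// ... and likewise for w2, w3, b1, b2, b3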
The computation graph of our neural network is now complete, and we can open a session and run it. In Python, the Optimizers API wraps this computing and applying of gradients to minimize the loss; once the Optimizer API is accessible from C++, we will be able to use it here instead.
We initialize a ClientSession and a vector of Tensors named outputs that will receive the network's outputs.
ClientSession session(scope);
std::vector<Tensor> outputs;
In Python, a single call to tf.global_variables_initializer() initializes all the variables, because the graph keeps the list of variables as it is built. In C++ we must list them ourselves: running each Assign node once copies its RandomNormal output into the corresponding variable.
// init the weights and biases by running the assigns nodes once
TF_CHECK_OK(session.Run({assign_w1, assign_w2, assign_w3, assign_b1, assign_b2, assign_b3}, nullptr));
We can now iterate for the chosen number of training steps, 5000 in our case. First we run the forward pass by fetching the loss node, which yields the network's current loss; we log it every 100 steps, and a decreasing loss is the sign that the network is learning. Then we run the gradient and update nodes: since the gradient nodes are inputs to the ApplyGradientDescent nodes, running the apply_* nodes first computes the gradients and then applies them to the correct variables.
// training steps
for (int i = 0; i < 5000; ++i) {
TF_CHECK_OK(session.Run({{x, x_data}, {y, y_data}}, {loss}, &outputs));
if (i % 100 == 0) {
std::cout << "Loss after " << i << " steps " << outputs[0].scalar<float>() << std::endl;
}
// nullptr because the output from the run is useless
TF_CHECK_OK(session.Run({{x, x_data}, {y, y_data}}, {apply_w1, apply_w2, apply_w3, apply_b1, apply_b2, apply_b3, layer_3}, nullptr));
}
With the network trained, we can try to predict the price of a car, that is, run inference. Let's estimate the price of a BMW 1 Series that is 7 years old, has 110,000 km on the odometer, and runs on diesel. To do this we run the layer_3 node, feeding the car's data into x: a single forward pass. Since we have trained for 5000 steps, the weights have been learned, so the output will not be random.
We cannot feed the car's raw attributes directly, because the network learned from normalized attributes; the data must go through the same normalization. The DataSet class provides an input method for this, using the metadata read from the first line of the CSV file.
// prediction using the trained neural net
TF_CHECK_OK(session.Run({{x, {data_set.input(110000.f, Fuel::DIESEL, 7.f)}}}, {layer_3}, &outputs));
cout << "DNN output: " << *outputs[0].scalar<float>().data() << endl;
std::cout << "Price predicted " << data_set.output(*outputs[0].scalar<float>().data()) << " euros" << std::endl;
The network outputs a value between 0 and 1; the data_set's output method uses the metadata to convert that value back into a human-readable price. The model can be run with the command bazel run -c opt //tensorflow/cc/models:model, and once TensorFlow has finished compiling, you will see output of this form:
Loss after 0 steps 0.317394
Loss after 100 steps 0.0503757
Loss after 200 steps 0.0487724
Loss after 300 steps 0.047366
Loss after 400 steps 0.0460944
Loss after 500 steps 0.0449263
Loss after 600 steps 0.0438395
Loss after 700 steps 0.0428183
Loss after 800 steps 0.041851
Loss after 900 steps 0.040929
Loss after 1000 steps 0.0400459
Loss after 1100 steps 0.0391964
Loss after 1200 steps 0.0383768
Loss after 1300 steps 0.0375839
Loss after 1400 steps 0.0368152
Loss after 1500 steps 0.0360687
Loss after 1600 steps 0.0353427
Loss after 1700 steps 0.0346358
Loss after 1800 steps 0.0339468
Loss after 1900 steps 0.0332748
Loss after 2000 steps 0.0326189
Loss after 2100 steps 0.0319783
Loss after 2200 steps 0.0313524
Loss after 2300 steps 0.0307407
Loss after 2400 steps 0.0301426
Loss after 2500 steps 0.0295577
Loss after 2600 steps 0.0289855
Loss after 2700 steps 0.0284258
Loss after 2800 steps 0.0278781
Loss after 2900 steps 0.0273422
Loss after 3000 steps 0.0268178
Loss after 3100 steps 0.0263046
Loss after 3200 steps 0.0258023
Loss after 3300 steps 0.0253108
Loss after 3400 steps 0.0248298
Loss after 3500 steps 0.0243591
Loss after 3600 steps 0.0238985
Loss after 3700 steps 0.0234478
Loss after 3800 steps 0.0230068
Loss after 3900 steps 0.0225755
Loss after 4000 steps 0.0221534
Loss after 4100 steps 0.0217407
Loss after 4200 steps 0.0213369
Loss after 4300 steps 0.0209421
Loss after 4400 steps 0.020556
Loss after 4500 steps 0.0201784
Loss after 4600 steps 0.0198093
Loss after 4700 steps 0.0194484
Loss after 4800 steps 0.0190956
Loss after 4900 steps 0.0187508
DNN output: 0.0969611
Price predicted 13377.7 euros
The predicted price here is 13,377.7 euros. Each run predicts a different price, which can range anywhere from 8,000 to 17,000 euros: the weights are randomly initialized, we describe each car with only three attributes, and our model architecture is quite simple.
As previously mentioned, the development of the C++ API is still ongoing, and we hope that more functionalities can be added in the near future.
Original link: https://matrices.io/training-a-deep-neural-network-using-only-tensorflow-c/