This document is for users who need backward compatibility across different versions of TensorFlow (whether for code or data), and for developers who want to change TensorFlow while preserving compatibility.
Semantic Versioning 2.0
The public API of TensorFlow follows Semantic Versioning 2.0 (semver). Each TensorFlow version number is in the format MAJOR.MINOR.PATCH. For example, the MAJOR version of TensorFlow version 1.2.3 is 1, the MINOR version is 2, and the PATCH version is 3. The changes in each number have the following meanings:
- MAJOR: Potentially backward-incompatible changes. Code and data that worked with a previous major version will not necessarily work with the new major version. However, in some cases existing TensorFlow graphs and checkpoints can be migrated to the new version; see the section on graph and checkpoint compatibility below for details on data compatibility.
- MINOR: Backward-compatible features, performance improvements, etc. Code and data that relied only on the public API of a previous minor version will continue to work with the new minor version. For details on which APIs are public and which are not, see the coverage section below.
- PATCH: Backward-compatible bug fixes.
For example, TensorFlow version 1.0.0 included backward-incompatible changes relative to version 0.12.1, whereas version 1.1.1 is backward-compatible with version 1.0.0.
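The version a program is running against can be read from Python. Below is a minimal sketch; note that pre-release builds may append a suffix such as -rc1 to the PATCH component.

```python
# A minimal sketch: read the running TensorFlow version and split it
# into its MAJOR.MINOR.PATCH components.
import tensorflow as tf

major, minor, patch = tf.__version__.split(".")[:3]
print("MAJOR:", major, "MINOR:", minor, "PATCH:", patch)
```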
Coverage
Only the public API of TensorFlow is backward-compatible between minor and patch versions. The public API includes:
1. All documented Python functions and classes in the tensorflow module and its submodules, except for the following:
- Functions and classes in tf.contrib
- Functions and classes whose names start with _ (as these are private)
Note that the code in the examples/ and tools/ directories cannot be reached through the tensorflow Python module and is therefore not covered by the compatibility guarantee.
If a symbol is accessible through the tensorflow Python module or its submodules but is not documented, it is not part of the public API.
2. C API
3. The following protocol buffer files:
- attr_value (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto)
- config (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto)
- event (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto)
- graph (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto)
- op_def (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto)
- reader_base (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/reader_base.proto)
- summary (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto)
- tensor (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto)
- tensor_shape (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor_shape.proto)
- types (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/types.proto)
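These protocol buffers are also available as generated Python modules inside the tensorflow package. The following is a minimal sketch, assuming a standard TensorFlow installation:

```python
# A minimal sketch: the covered protocol buffers are importable as
# generated Python modules within the tensorflow package.
from tensorflow.core.protobuf import config_pb2   # config.proto
from tensorflow.core.framework import graph_pb2   # graph.proto

# A session configuration message built from config.proto.
config = config_pb2.ConfigProto(allow_soft_placement=True)

# An empty graph message from graph.proto.
graph_def = graph_pb2.GraphDef()
print(config.allow_soft_placement, len(graph_def.node))
```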
Non-Covered APIs
Some API functions are explicitly marked as ‘experimental’ and can change in a backward-incompatible way between Minor versions. These APIs include:
- Experimental APIs: the tf.contrib module and its submodules in Python, as well as any functions in the C API or fields in protocol buffers that are explicitly annotated as experimental. In particular, any field in a protocol buffer that is labeled 'experimental' can change at any time.
- APIs in Other Languages: TensorFlow APIs developed in languages other than Python and C, such as APIs in the following languages:
  - C++ (provided through header files in tensorflow/cc)
  - Java
  - Go
- Details of Composite Operations: Many public functions in Python expand into several primitive operations in the graph, and these details are part of any graph saved to disk as a GraphDef. These details may change between minor versions. In particular, regression tests that check for exact matches between graphs are likely to break across minor versions, even though the behavior of the graph is unchanged and existing checkpoints still work.
- Floating-Point Value Details: The specific floating-point values computed by operations may change at any time. Users should rely only on approximate accuracy and numerical stability, not on the specific values computed. Changes to numerical formulas in minor and patch versions should yield comparable or improved accuracy, with the caveat that in machine learning, improved accuracy of a specific formula may decrease the accuracy of the overall system.
- Random Numbers: The specific random numbers computed by random operations may change at any time. Users should rely only on approximately correct distributions and statistical strength, not the specific values computed. However, we will rarely (if ever) change random bits in patch versions. All such changes will, of course, be documented.
- Version Skew in Distributed TensorFlow: Running two different TensorFlow versions in a single cluster is not supported. We make no guarantees about backward compatibility of the wire protocol.
- Bugs: We reserve the right to make backward-incompatible changes to behavior (though not to the API) if the current implementation is clearly broken, that is, if it contradicts the documentation or if well-known, well-defined intended behavior is not correctly implemented due to a bug. For example, if an optimizer claims to implement a well-known optimization algorithm but does not match that algorithm because of a bug, we will fix the optimizer. Our fix may break code that relied on the erroneous convergence behavior. Such changes will be noted in the release notes.
- Error Messages: We reserve the right to change the text of error messages. In addition, the type of an error may change unless that type is specified in the documentation. For example, a function documented to raise an InvalidArgument exception will continue to raise InvalidArgument, but the human-readable message content may change.
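For example, code that branches on the documented error type rather than the message text stays within the guarantee. Below is a minimal sketch, assuming eager execution (TF 2.x style); the shapes used to trigger the error are only illustrative:

```python
# A minimal sketch: rely on the documented error type
# (tf.errors.InvalidArgumentError), never on the message text,
# which may change between releases.
import tensorflow as tf

try:
    # Adding tensors with incompatible shapes raises InvalidArgumentError
    # under eager execution.
    tf.add(tf.ones([2, 3]), tf.ones([4]))
except tf.errors.InvalidArgumentError:
    # Branch on the exception type, not on the message string.
    print("Caught an invalid-argument error")
```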
Compatibility of Graphs and Checkpoints
Users sometimes need to save graphs and checkpoints. A graph describes the data flow of ops to be run during training and inference, while a checkpoint contains the tensor values of variables saved in the graph.
Many TensorFlow users save graphs and trained models to disk for later evaluation or further training, but end up running the saved graph or model against a newer TensorFlow version. In accordance with semver, any graph or checkpoint written with one version of TensorFlow can be loaded and evaluated with a later (minor or patch) version of TensorFlow within the same major version. We will also continue to strive for backward compatibility across major versions where possible, so that serialized files remain usable over long periods of time.
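As a concrete illustration, a consumer built against a newer 1.x release can load a model exported by an older 1.x release. Below is a minimal sketch using the TF 1.x SavedModel loader; the export_dir path is hypothetical:

```python
# A minimal sketch (TF 1.x style): load a SavedModel exported by an
# older TensorFlow release within the same major version.
import tensorflow as tf

export_dir = "/tmp/exported_model"  # hypothetical path

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    # The restored graph and variables are now available in `sess`.
```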
Graphs are serialized through the GraphDef protocol buffer. To facilitate (rare) backward-incompatible changes to graphs, each GraphDef has a version number that is independent of the TensorFlow version. For example, GraphDef version 17 deprecated the inv op in favor of reciprocal. The semantics are:
- Each version of TensorFlow supports an interval of GraphDef versions. This interval is constant across patch versions and only grows across minor versions. Dropping support for a GraphDef version only happens in a major version of TensorFlow.
- Newly created graphs are assigned the latest GraphDef version number.
- If a TensorFlow version supports the GraphDef version of a graph, it will load and evaluate the graph with the same behavior as the TensorFlow version it was generated with (except for floating-point value details and random numbers), regardless of the major version of TensorFlow. In particular, it will be compatible with the corresponding checkpoint files.
- If the GraphDef upper bound is increased to X in a (minor) version, at least six months will pass before the lower bound is increased to X. For example (using hypothetical version numbers here):
  - TensorFlow 1.2 might support GraphDef versions 4 to 7.
  - TensorFlow 1.3 could add GraphDef version 8 and support versions 4 to 8.
  - At least six months later, TensorFlow 2.0.0 could drop support for versions 4 to 7, leaving version 8 only.
Finally, when support for a GraphDef version is dropped, we will attempt to provide tools to help users automatically convert their graphs to a newer supported GraphDef version.
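To see where a particular graph sits in this scheme, you can inspect the VersionDef stored in its versions field. Below is a minimal sketch, assuming a serialized GraphDef at a hypothetical path graph.pb:

```python
# A minimal sketch: read a serialized GraphDef and print the version
# metadata recorded in its VersionDef.
from tensorflow.core.framework import graph_pb2

graph_def = graph_pb2.GraphDef()
with open("graph.pb", "rb") as f:  # hypothetical path
    graph_def.ParseFromString(f.read())

print("producer:", graph_def.versions.producer)
print("min_consumer:", graph_def.versions.min_consumer)
print("bad_consumers:", list(graph_def.versions.bad_consumers))
```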
Compatibility of Graphs and Checkpoints When Extending TensorFlow
This section only relates to incompatible changes to the GraphDef format, such as adding ops, removing ops, or changing the functionality of existing ops. For most users, reading the previous section will suffice.
Backward Compatibility and Partial Forward Compatibility
Our version control scheme has three requirements:
- Backward compatibility to support loading graphs and checkpoints created with older versions of TensorFlow.
- Forward compatibility to support scenarios where the producer of a graph or checkpoint is upgraded to a newer version of TensorFlow before the consumer.
- The ability to evolve TensorFlow in backward-incompatible ways, for example removing ops, adding attributes, and removing attributes.
Note that although the GraphDef version mechanism is independent of the TensorFlow version, backward-incompatible changes to the GraphDef format are still constrained by semantic versioning. This means that functionality can only be removed or changed between major versions of TensorFlow (e.g., from 1.7 to 2.0). In addition, forward compatibility is enforced within patch versions (e.g., from 1.x.1 to 1.x.2).
To achieve backward and forward compatibility, and to know when to enforce format changes, graphs and checkpoints carry metadata describing when they were produced. The following sections detail the TensorFlow implementation and the guidelines for evolving GraphDef versions.
Independent Data Version Scheme
Graphs and checkpoints have separate data versions. The two data formats evolve at different rates from each other, and also at different rates from TensorFlow. Both versioning schemes are defined in core/public/version.h. Whenever a new version is added, a note is added to that header detailing what changed and when.
Data, Producers, and Consumers
Data version information distinguishes between:
- Producers: binaries that produce data. A producer has a version (producer) and a minimum consumer version that it is compatible with (min_consumer).
- Consumers: binaries that consume data. A consumer has a version (consumer) and a minimum producer version that it is compatible with (min_producer).
Each piece of versioned data has a VersionDef versions field that records the producer that created the data, the min_consumer it is compatible with, and a list of bad_consumers versions that are disallowed.
By default, when a producer creates data, the data inherits the producer's producer and min_consumer versions. bad_consumers can be set if specific consumer versions are known to contain bugs and must be avoided. A consumer can accept a piece of data if all of the following hold:
- consumer >= min_consumer of the data
- producer of the data >= min_producer of the consumer
- consumer is not in the bad_consumers list of the data
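Put together, the acceptance check is roughly the following. This is a minimal sketch written for illustration; data_versions stands for the VersionDef attached to the data, while consumer and min_producer describe the consumer binary:

```python
# A minimal sketch of the acceptance check a consumer performs on
# versioned data. `data_versions` is the VersionDef attached to the
# data; `consumer` and `min_producer` describe the consumer binary.
def is_compatible(consumer, min_producer, data_versions):
    return (
        consumer >= data_versions.min_consumer            # condition 1
        and data_versions.producer >= min_producer        # condition 2
        and consumer not in data_versions.bad_consumers   # condition 3
    )
```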
Since both producers and consumers come from the same TensorFlow codebase, core/public/version.h contains a main data version (treated as either producer or consumer depending on context), as well as the min_consumer and min_producer accepted by producers and consumers, respectively. Specifically:
- For GraphDef versions, we have TF_GRAPH_DEF_VERSION, TF_GRAPH_DEF_VERSION_MIN_CONSUMER, and TF_GRAPH_DEF_VERSION_MIN_PRODUCER.
- For checkpoint versions, we have TF_CHECKPOINT_VERSION, TF_CHECKPOINT_VERSION_MIN_CONSUMER, and TF_CHECKPOINT_VERSION_MIN_PRODUCER.
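The GraphDef constants compiled into a given TensorFlow binary are also mirrored into the Python API. Below is a minimal sketch using the TF 1.x attribute names; TF 2.x exposes the same values under tf.version, and the exact attribute names may vary between releases:

```python
# A minimal sketch: the GraphDef version constants from
# core/public/version.h as seen from Python (TF 1.x names).
import tensorflow as tf

print("GraphDef version (producer):", tf.GRAPH_DEF_VERSION)
print("Min consumer:", tf.GRAPH_DEF_VERSION_MIN_CONSUMER)
print("Min producer:", tf.GRAPH_DEF_VERSION_MIN_PRODUCER)
```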
Adding New Attributes with Default Values to Existing Operations
Following the guidance below gives you forward compatibility only if the set of operations has not changed:
- If forward compatibility is required, set strip_default_attrs to True while exporting the model with the add_meta_graph_and_variables and add_meta_graph methods of the SavedModelBuilder class, or with Estimator.export_savedmodel.
- This strips the default-valued attributes at model generation/export time, which ensures that the exported tf.MetaGraphDef does not contain a new operation attribute when its default value is in use.
- This control lets out-of-date consumers (for example, serving binaries that lag behind training binaries) continue loading the models and prevents interruptions in model serving.
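A minimal sketch of such an export in TF 1.x follows; the export_dir path and the variable are placeholders for a real model:

```python
# A minimal sketch (TF 1.x style): export a SavedModel with
# strip_default_attrs=True so default-valued op attributes are not
# written into the MetaGraphDef.
import tensorflow as tf

export_dir = "/tmp/exported_model"  # hypothetical path

with tf.Session(graph=tf.Graph()) as sess:
    v = tf.Variable(42, name="v")   # placeholder model content
    sess.run(tf.global_variables_initializer())

    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        strip_default_attrs=True)
    builder.save()
```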
Evolving GraphDef Versions
This section describes how to use this versioning mechanism to make different types of changes to the GraphDef format.
Adding Operations
Add the new operation to both consumers and producers at the same time, without changing any GraphDef version. This type of change is automatically backward-compatible, and it does not affect the forward compatibility plan because existing producer scripts will not suddenly use the new functionality.
Adding an Operation and Switching Existing Python Wrappers to Use It
1. Implement the new consumer functionality and increment the GraphDef version.
2. If the wrappers can be made to use the new functionality only in cases that did not work before, the Python wrappers can be updated now.
3. Change the Python wrappers to use the new functionality. Do not increment min_consumer, since models that do not use this operation will not break.
Removing or Limiting the Functionality of Operations
1. Fix all producer scripts (not TensorFlow itself) so they do not use the banned operation or functionality.
2. Increment the GraphDef version and implement new consumer functionality that bans the removed operation or functionality for GraphDefs at the new version and above. If possible, make TensorFlow stop producing GraphDefs with the banned functionality; to do so, add REGISTER_OP(…).Deprecated(deprecated_at_version, message).
3. Wait for a major version change, for backward compatibility purposes.
4. Increase min_producer to the GraphDef version from step 2 and remove the functionality entirely.
Changing the Functionality of Operations
1. Add a similar new operation named SomethingV2 or similar, and go through the process of adding it and switching existing Python wrappers to use it. To ensure forward compatibility, use the checks suggested in compat.py when changing the Python wrappers (see the sketch after this list).
2. Remove the old operation (due to backward compatibility, this can only happen with a major version change).
3. Increase min_consumer to rule out consumers that still use the old operation, add the old operation back as an alias for SomethingV2, and go through the process of switching existing Python wrappers to use it.
4. Go through the process of removing SomethingV2.
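A minimal sketch of the kind of gate compat.py provides is shown below. tf.compat.forward_compatible exists in recent TensorFlow releases; the wrapper names and the cutover date are hypothetical:

```python
# A minimal sketch: gate the switch to a new op behind a forward
# compatibility horizon so graphs produced today can still be consumed
# by slightly older binaries. Wrapper names and the date are hypothetical.
import tensorflow as tf

def _something_v1(x):
    return tf.negative(x)  # stand-in for the old op

def _something_v2(x):
    return -x              # stand-in for the new op

def something(x):
    # Only emit the new op once consumers built before the given date
    # can be assumed to have been updated.
    if tf.compat.forward_compatible(2019, 8, 1):  # hypothetical date
        return _something_v2(x)
    return _something_v1(x)
```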
Disabling Individual Unsafe Consumer Versions
1. Bump the GraphDef version and add the bad version to the bad_consumers list for all new GraphDefs. If possible, add it to bad_consumers only for GraphDefs that contain a certain operation or similar.
2. If existing consumers are running the bad version, push them out of use as soon as possible.