- N-4: the pixels directly above, below, to the left of, and to the right of the reference pixel. In a 3x3 grid of pixels labelled "a" through "i" with "e" at the center, N-4 of "e" contains "b", "d", "f", and "h".
- ND: the pixels touching the reference pixel diagonally. For pixel "e", ND contains "a", "c", "g", and "i".
- N-8: all pixels surrounding the reference pixel; it is the union of N-4 and ND. For pixel "e", N-8 contains "a", "b", "c", "d", "f", "g", "h", and "i" (see the code sketch below).
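These neighborhoods are straightforward to compute. A minimal sketch in NumPy, using the same 3x3 "a" through "i" grid (the grid and the neighbors function are illustrative, not from any library):

import numpy as np

# The same 3x3 grid as above, with "e" at the center.
img = np.array([['a', 'b', 'c'],
                ['d', 'e', 'f'],
                ['g', 'h', 'i']])

def neighbors(img, r, c):
    """Return the N-4, ND, and N-8 neighbors of pixel (r, c)."""
    h, w = img.shape
    n4_offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # above, below, left, right
    nd_offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # the four diagonals
    inside = lambda rr, cc: 0 <= rr < h and 0 <= cc < w
    n4 = [img[r + dr, c + dc] for dr, dc in n4_offsets if inside(r + dr, c + dc)]
    nd = [img[r + dr, c + dc] for dr, dc in nd_offsets if inside(r + dr, c + dc)]
    return n4, nd, n4 + nd  # N-8 is the union of N-4 and ND

n4, nd, n8 = neighbors(img, 1, 1)  # pixel "e" is at row 1, column 1
print(n4)  # ['b', 'h', 'd', 'f']
print(nd)  # ['a', 'c', 'g', 'i']
print(n8)  # all eight surrounding pixels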
Dataset Overview

The dataset is stored in an HDF5 file and contains 10,000 training samples and 2,000 test samples across 10 classes. Each sample is a flattened vector of 4,096 values, which reshapes into a 16x16x16 single-channel voxel grid (this layout matches the public 3D MNIST dataset).
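As a quick sanity check, the file layout can be inspected directly. A minimal sketch, assuming the file is named full_dataset_vectors.h5 (the filename and the label shapes are assumptions; the keys are the ones used in the loading code below):

import h5py

with h5py.File("full_dataset_vectors.h5", "r") as f:  # filename is an assumption
    for key in f:
        print(key, f[key].shape)

# Expected, based on the shapes printed later in this article:
#   X_train (10000, 4096)    X_test (2000, 4096)
#   y_train (10000,)         y_test (2000,)   <- label shapes assumed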
Importing Modules
import numpy as np
import h5py
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.initializers import Constant
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
Loading the Dataset
# The filename below is a placeholder; point it at your local copy of the dataset.
dataset = h5py.File("full_dataset_vectors.h5", "r")
xtrain, xtest = dataset["X_train"][:], dataset["X_test"][:]
ytrain, ytest = dataset["y_train"][:], dataset["y_test"][:]
xtrain = np.array(xtrain)
xtest = np.array(xtest)
print('train shape:', xtrain.shape)
print('test shape:', xtest.shape)
# Reshape each flattened 4096-vector into a 16x16x16 voxel grid with 1 channel
xtrain = xtrain.reshape(xtrain.shape[0], 16, 16, 16, 1)
xtest = xtest.reshape(xtest.shape[0], 16, 16, 16, 1)
# One-hot encode the 10 class labels
ytrain, ytest = to_categorical(ytrain, 10), to_categorical(ytest, 10)
train shape: (10000, 4096)
test shape: (2000, 4096)
Building the 3D-CNN
model = Sequential()
# Two 3x3x3 convolutions over the 16x16x16x1 voxel grid, then 2x2x2 max pooling
model.add(layers.Conv3D(32, (3, 3, 3), activation='relu',
                        input_shape=(16, 16, 16, 1),
                        bias_initializer=Constant(0.01)))
model.add(layers.Conv3D(32, (3, 3, 3), activation='relu',
                        bias_initializer=Constant(0.01)))
model.add(layers.MaxPooling3D((2, 2, 2)))
model.add(layers.Conv3D(64, (3, 3, 3), activation='relu'))
model.add(layers.Conv3D(64, (2, 2, 2), activation='relu'))
model.add(layers.MaxPooling3D((2, 2, 2)))
model.add(layers.Dropout(0.6))
# Flatten and classify with dropout-regularized dense layers
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.7))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv3d_5 (Conv3D) (None, 14, 14, 14, 32) 896
_________________________________________________________________
conv3d_6 (Conv3D) (None, 12, 12, 12, 32) 27680
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 6, 6, 6, 32) 0
_________________________________________________________________
conv3d_7 (Conv3D) (None, 4, 4, 4, 64) 55360
_________________________________________________________________
conv3d_8 (Conv3D) (None, 3, 3, 3, 64) 32832
_________________________________________________________________
max_pooling3d_3 (MaxPooling3 (None, 1, 1, 1, 64) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 1, 1, 1, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 64) 0
_________________________________________________________________
dense_3 (Dense) (None, 256) 16640
_________________________________________________________________
dropout_5 (Dropout) (None, 256) 0
_________________________________________________________________
dense_4 (Dense) (None, 128) 32896
_________________________________________________________________
dropout_6 (Dropout) (None, 128) 0
_________________________________________________________________
dense_5 (Dense) (None, 10) 1290
=================================================================
Total params: 167,594
Trainable params: 167,594
Non-trainable params: 0
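The summary is easy to verify by hand. Each valid 3x3x3 convolution trims 2 voxels from every spatial dimension (16 -> 14 -> 12), each 2x2x2 pooling halves them, and the parameter counts follow from out_ch * (k^3 * in_ch) + out_ch for Conv3D and out * in + out for Dense. A quick sketch that reproduces the counts above:

conv = lambda k, cin, cout: cout * (k**3 * cin) + cout   # Conv3D weights + biases
dense = lambda nin, nout: nout * nin + nout              # Dense weights + biases
print(conv(3, 1, 32))   # 896
print(conv(3, 32, 32))  # 27680
print(conv(3, 32, 64))  # 55360
print(conv(2, 64, 64))  # 32832
print(dense(64, 256))   # 16640
print(dense(256, 128))  # 32896
print(dense(128, 10))   # 1290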
Training the 3D-CNN
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(xtrain, ytrain,
          epochs=200,
          batch_size=32,
          verbose=1,
          validation_data=(xtest, ytest),
          callbacks=[EarlyStopping(patience=15)])
Epoch 1/200
313/313 [==============================] - 39s 123ms/step - loss: 2.2782 - accuracy: 0.1237 - val_loss: 2.1293 - val_accuracy: 0.2235
Epoch 2/200
313/313 [==============================] - 39s 124ms/step - loss: 2.0718 - accuracy: 0.2480 - val_loss: 1.8067 - val_accuracy: 0.3395
Epoch 3/200
313/313 [==============================] - 39s 125ms/step - loss: 1.8384 - accuracy: 0.3382 - val_loss: 1.5670 - val_accuracy: 0.4260
...
...
Epoch 87/200
313/313 [==============================] - 39s 123ms/step - loss: 0.7541 - accuracy: 0.7327 - val_loss: 0.9970 - val_accuracy: 0.7061
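Training stops at epoch 87 rather than running all 200 epochs because EarlyStopping halts training once the monitored metric (val_loss by default) has not improved for patience=15 consecutive epochs. By default the model keeps the weights from the final epoch, which is why the evaluation below reproduces the epoch-87 losses exactly. A common variant (an option, not what this run used) restores the best weights instead:

early_stop = EarlyStopping(monitor='val_loss',         # the default monitored metric
                           patience=15,                # epochs to wait for improvement
                           restore_best_weights=True)  # keep the best, not the last, weights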
Testing the 3D-CNN
_, acc = model.evaluate(xtrain, ytrain)
print('training accuracy:', str(round(acc*100, 2))+'%')
_, acc = model.evaluate(xtest, ytest)
print('testing accuracy:', str(round(acc*100, 2))+'%')
313/313 [==============================] - 11s 34ms/step - loss: 0.7541 - accuracy: 0.7327
training accuracy: 73.27%
63/63 [==============================] - 2s 34ms/step - loss: 0.9970 - accuracy: 0.7060
testing accuracy: 70.61%
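Beyond aggregate accuracy, the trained model can also be queried per sample. A minimal sketch (illustrative, not from the original code) that reads predictions off the softmax output:

probs = model.predict(xtest[:5])       # shape (5, 10): softmax class probabilities
preds = np.argmax(probs, axis=1)       # predicted class indices
truth = np.argmax(ytest[:5], axis=1)   # recover indices from the one-hot labels
print('predicted:', preds)
print('actual:   ', truth)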
Conclusion
In this article, we covered:

- The neighborhoods of pixels in images
- Why ANNs perform poorly on image datasets
- The differences between CNNs and ANNs
- How CNNs work
- Building and training a 3D-CNN in TensorFlow