Reconstructing 3D objects with machine learning: what am I doing wrong?
Good evening. For three months I have been struggling with reconstructing a 3D model of a hand from a photo.
I generated a dataset of 10,000 images of a hand rendered in Blender, with different bone positions. As the output vector, I decided to use the positions of the vertices of the 3D model. (Yes, I understand that predicting the bone positions would have been more reasonable, but I decided to try it this way.)
I set up the data augmentation so that it is almost impossible to find two identical images in the dataset: the image of the hand is superimposed on an arbitrary photo, filters are applied so that the hand does not stand out against the background, and noise imitating poor camera quality is added. One of the photos looks something like this:
I built a convolutional neural network in Keras (I can't provide a diagram, because I couldn't install Graphviz):
from keras.layers import (Input, BatchNormalization, Conv2D, MaxPooling2D,
                          Flatten, Dense, Dropout)

# res, primitives and out_size are defined elsewhere in the script
inp = Input(shape=(res, res, 3))
# Channels-last input: normalize over the channel axis (-1)
bath_0 = BatchNormalization(axis=-1)(inp)
# Keras 2 Conv2D takes padding='same' (border_mode is the Keras 1 name)
x1 = Conv2D(primitives, kernel_size=(9, 9), padding='same', activation='relu')(bath_0)
pool_1 = MaxPooling2D(pool_size=(2, 2))(x1)
bath_1 = BatchNormalization(axis=-1)(pool_1)
x2 = Conv2D(primitives*2, kernel_size=(3, 3), padding='same', activation='relu')(bath_1)
x3 = Conv2D(primitives*2, kernel_size=(3, 3), padding='same', activation='relu')(x2)
x4 = Conv2D(primitives*2, kernel_size=(3, 3), padding='same', activation='relu')(x3)
pool_2 = MaxPooling2D(pool_size=(2, 2))(x4)
bath_2 = BatchNormalization(axis=-1)(pool_2)
x5 = Conv2D(primitives*4, kernel_size=(3, 3), padding='same', activation='relu')(bath_2)
x6 = Conv2D(primitives*4, kernel_size=(3, 3), padding='same', activation='relu')(x5)
x7 = Conv2D(primitives*4, kernel_size=(3, 3), padding='same', activation='relu')(x6)
pool_3 = MaxPooling2D(pool_size=(2, 2))(x7)
# Fully connected regression head
x8 = Flatten()(pool_3)
x9 = Dense(1700, activation='relu')(x8)
d_1 = Dropout(0.5)(x9)
x10 = Dense(1700, activation='relu')(d_1)
d_2 = Dropout(0.5)(x10)
x11 = Dense(1700, activation='relu')(d_2)
# tanh limits every predicted coordinate to [-1, 1]
out = Dense(out_size, activation='tanh')(x11)
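Because the output layer uses tanh, the network can only predict values in [-1, 1], so the vertex coordinates would have to be scaled into that range before training. Below is a minimal sketch of such a scaling, assuming the targets are stored as a NumPy array of shape (n_samples, out_size). The function names are hypothetical and not part of the original script.

```python
import numpy as np

def fit_target_scaler(targets):
    """Compute per-coordinate center and half-range from the training targets."""
    lo = targets.min(axis=0)
    hi = targets.max(axis=0)
    center = (hi + lo) / 2.0
    half_range = (hi - lo) / 2.0
    # Constant coordinates would give a zero range; use 1 to avoid dividing by zero
    half_range[half_range == 0] = 1.0
    return center, half_range

def scale_targets(targets, center, half_range):
    """Map targets into [-1, 1] so they match the range of a tanh output."""
    return (targets - center) / half_range

def unscale_targets(scaled, center, half_range):
    """Map network outputs back to the original coordinate units."""
    return scaled * half_range + center
```

The scaler should be fitted on the training set only and then reused for validation data and for converting predictions back to model coordinates.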
Maybe the dataset is too hard: start with more photos and less aggressive augmentation (you can begin with just a black background).
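One way to try this is to make the augmentation strength adjustable, so that training can begin on a black background without noise and get harder later. This is only a sketch, assuming each render is available as an RGB array in [0, 1] together with a hand mask in [0, 1]. The function `composite_hand` and its parameters are hypothetical names, not something from the question.

```python
import numpy as np

def composite_hand(hand_rgb, hand_mask, background=None, noise_std=0.0, rng=None):
    """Place a rendered hand on a background and optionally add Gaussian noise.

    hand_rgb:   (H, W, 3) float array in [0, 1]
    hand_mask:  (H, W) float array in [0, 1], 1 where the hand is
    background: (H, W, 3) float array in [0, 1], or None for a black background
    noise_std:  standard deviation of the added noise (0 disables it)
    """
    rng = np.random.default_rng() if rng is None else rng
    if background is None:
        background = np.zeros_like(hand_rgb)
    alpha = hand_mask[..., None]  # broadcast the mask over the color channels
    image = alpha * hand_rgb + (1.0 - alpha) * background
    if noise_std > 0:
        image = image + rng.normal(0.0, noise_std, size=image.shape)
    return np.clip(image, 0.0, 1.0)
```

Calling it with only `hand_rgb` and `hand_mask` gives the easy black-background case. Passing a photo as `background` and a small `noise_std` moves back toward the harder setup described in the question.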
Also, the architecture is not well suited to this task; read up on what has been done on it in recent years: https://github.com/xinghaochen/awesome-hand-pose-e...