Python
geniusperson, 2018-12-22 01:27:53

How to remove voice repetition during face recognition?

Hello!
When a face is recognized, the program should say the person's name only once.
For example: Hi, Ivan!
The code is fully written, but I don't know how to make it speak just once.
When I run the code, it repeats the greeting over and over.
This happens because the call is inside the while True loop.
Recognition runs in real time, so I can't remove while True; I only need each person to be recognized once and greeted once.
Full code:

import cv2
import numpy as np
import os
from gtts import gTTS
import pygame
import io

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer/trainer.yml')
cascadePath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascadePath)

font = cv2.FONT_HERSHEY_SIMPLEX

# initialize id counter
id = 0

# names related to ids: example ==> Marcelo: id=1,  etc
names = ['None', 'Ivan', 'Vasiliy', 'Ilza', 'Z', 'W']



# Initialize and start realtime video capture
cam = cv2.VideoCapture(0)
cam.set(3, 640) # set video width
cam.set(4, 480) # set video height

# Define min window size to be recognized as a face
minW = 0.1*cam.get(3)
minH = 0.1*cam.get(4)


def speek(a):
    # 'Привет' is Russian for "Hi"; the greeting is synthesized in Russian
    tts = gTTS(text='Привет' + "" + a + "", lang='ru')
    with io.BytesIO() as f:
        tts.save('text.mp3')

        # initialize pygame
        pygame.mixer.init()
        pygame.init()

        # load the speech from the mp3 file
        pygame.mixer.music.load('text.mp3')
        pygame.mixer.music.play()

        # music.play() is a non-blocking method
        # the code below waits until the speech has finished playing
        pygame.mixer.music.set_endevent(pygame.USEREVENT)
        pygame.event.wait()

while True:

    ret, img =cam.read()
    img = cv2.flip(img, 1) # Flip horizontally (mirror image)

    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

    faces = faceCascade.detectMultiScale( 
        gray,
        scaleFactor = 1.2,
        minNeighbors = 5,
        minSize = (int(minW), int(minH)),
       )

    for(x,y,w,h) in faces:

        cv2.rectangle(img, (x,y), (x+w,y+h), (0,255,0), 2)

        id, confidence = recognizer.predict(gray[y:y+h,x:x+w])

        a = names[id]
        # Check if confidence is less than 100 ==> "0" is perfect match
        if (confidence < 100):
            id = names[id]
            confidence = "  {0}%".format(round(100 - confidence))
            speek(a)
        else:
            id = "unknown"
            confidence = "  {0}%".format(round(100 - confidence))
        
        cv2.putText(img, str(id), (x+5,y-5), font, 1, (255,255,255), 2)
        cv2.putText(img, str(confidence), (x+5,y+h-5), font, 1, (255,255,0), 1)  
    
    cv2.imshow('camera',img) 

    k = cv2.waitKey(10) & 0xff # Press 'ESC' for exiting video
    if k == 27:
        break

# Do a bit of cleanup
print("\n [INFO] Exiting Program and cleanup stuff")
cam.release()
cv2.destroyAllWindows()


2 answer(s)
Vladimir Kuts, 2018-12-22
@fox_12

Use a flag, for example, or keep a collection of the users that have already been recognized:

detected = set()  # names that have already been greeted
while True:
    ...
    if (confidence < 100) and a not in detected:
        ...
        speek(a)  # speek() is the function defined in the question
        detected.add(a)

rPman, 2018-12-22
@rPman

Store the time of the last recognition in a dictionary keyed by the spoken name. Each time the face is found again, compare that time with the current one; if the difference is less than some constant, do not say the name.
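
The timestamp idea above could look roughly like the sketch below. This is only one possible reading of it: the names should_speak, last_seen and COOLDOWN_SECONDS are hypothetical, and the 30-second value is an arbitrary assumption. The time is refreshed on every recognition, so a person who stays in front of the camera is greeted once and is greeted again only after being absent for longer than the cooldown.

```python
import time

# Assumed value: how long a person must be absent before being greeted again.
COOLDOWN_SECONDS = 30.0

# Hypothetical store: spoken name -> time of the last recognition.
last_seen = {}


def should_speak(name, now=None, cooldown=COOLDOWN_SECONDS):
    """Return True if `name` was not recognized within the last `cooldown` seconds."""
    if now is None:
        now = time.monotonic()  # monotonic clock, unaffected by system clock changes
    previous = last_seen.get(name)
    last_seen[name] = now  # record every recognition, spoken or not
    return previous is None or now - previous >= cooldown
```

In the loop from the question, the call would then be guarded as `if confidence < 100 and should_speak(a): speek(a)`. If a fixed repeat interval is preferred instead (greet again every N seconds even while the person stays in frame), update last_seen only when the function returns True.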
