The Typical Image Detection Process

I am working on an Image Detection project that finds people and the ball in soccer scenes. In that context, I looked into the typical Image Detection process, summarized below.

This is the description of the typical algorithmic approach that could be used to find a soccer ball (or any specific object) in an image:

Resize the Image: The image is resized to a standard dimension to maintain consistency.
Color Space Conversion: Sometimes, the image is converted to a different color space like grayscale or HSV (Hue, Saturation, Value) to simplify the analysis.
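
The two preprocessing steps above can be sketched in a few lines. This is only an illustration using the Python standard library on a tiny hand-made image; the helper names (`resize_nearest`, `to_grayscale`, `to_hsv`) are made up for this example, and nearest-neighbor sampling and the BT.601 grayscale weights are assumptions, not something the steps above prescribe. In a real project a library such as OpenCV (`cv2.resize`, `cv2.cvtColor`) would normally do this work.

```python
import colorsys


def resize_nearest(image, new_width, new_height):
    """Resize an image (a list of rows of pixels) with nearest-neighbor sampling."""
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for y in range(new_height):
        src_y = y * old_height // new_height  # source row for this output row
        row = []
        for x in range(new_width):
            src_x = x * old_width // new_width  # source column for this output column
            row.append(image[src_y][src_x])
        resized.append(row)
    return resized


def to_grayscale(image):
    """Convert RGB pixels (0-255) to one luma value each, using BT.601 weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def to_hsv(image):
    """Convert RGB pixels (0-255) to HSV tuples with every channel in 0.0-1.0."""
    return [
        [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for (r, g, b) in row]
        for row in image
    ]


# A tiny 2x2 RGB image: red, green on the first row; blue, white on the second.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

resized = resize_nearest(image, 4, 4)  # bring every input to the same 4x4 size
gray = to_grayscale(resized)
hsv = to_hsv(resized)
```

Grayscale drops color entirely, while HSV separates hue from brightness, which is why it is often preferred when an object is to be picked out by color.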

Using Pre-trained Models: There are pre-trained models like YOLO (You Only Look Once), SSD (Single Shot MultiBox Detector), and Faster R-CNN that have been trained on extensive datasets to recognize various objects, including balls.
Custom Model: A custom model can also be trained for specific objects. In this case, a model could be trained specifically to recognize soccer balls.

Bounding Box: The object detection model will identify the location of the ball and create a bounding box around it.
Coordinates: The model will provide the coordinates of the bounding box, indicating the location of the ball.
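
Models report box coordinates in different conventions, so it helps to be explicit about which one is in use. The sketch below assumes one common convention, a normalized center format (center x, center y, width, height, each in 0.0-1.0, as used in YOLO-style label files), and converts it to pixel corners. The `BoundingBox` class and the `from_normalized_center` helper are hypothetical names for this example only.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Axis-aligned box in pixel coordinates (top-left and bottom-right corners)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    @property
    def width(self):
        return self.x_max - self.x_min

    @property
    def height(self):
        return self.y_max - self.y_min

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def _clamp(value, upper):
    """Keep a pixel coordinate inside the range 0..upper."""
    return min(max(value, 0), upper)


def from_normalized_center(cx, cy, w, h, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel corners, clamped to the image."""
    return BoundingBox(
        x_min=_clamp(round((cx - w / 2) * image_width), image_width),
        y_min=_clamp(round((cy - h / 2) * image_height), image_height),
        x_max=_clamp(round((cx + w / 2) * image_width), image_width),
        y_max=_clamp(round((cy + h / 2) * image_height), image_height),
    )


# A ball detected around the middle of a 1280x720 frame.
box = from_normalized_center(0.5, 0.5, 0.1, 0.2, 1280, 720)
print(box, box.width, box.height, box.center)
```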

Filtering: If multiple detections are made, filtering is done based on confidence scores to keep the detection with the highest confidence.
Display or Output: Either the final image with the bounding box around the ball is displayed, or the coordinates are output for further use.
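
The filtering and output steps can be sketched as follows. The detections here are made-up values, not the output of any real model, and the `Detection` class, the helper names, and the 0.5 threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # e.g. "ball" or "person"
    confidence: float  # model score in 0.0-1.0
    box: tuple         # (x_min, y_min, x_max, y_max) in pixels


def filter_detections(detections, min_confidence=0.5):
    """Drop every detection whose confidence is below the threshold."""
    return [d for d in detections if d.confidence >= min_confidence]


def best_per_label(detections):
    """Keep only the highest-confidence detection for each label."""
    best = {}
    for d in detections:
        current = best.get(d.label)
        if current is None or d.confidence > current.confidence:
            best[d.label] = d
    return best


# Hypothetical raw detections for one frame.
detections = [
    Detection("ball", 0.91, (610, 340, 640, 370)),
    Detection("ball", 0.42, (100, 90, 130, 120)),
    Detection("person", 0.88, (300, 200, 380, 420)),
    Detection("person", 0.35, (900, 210, 960, 400)),
    Detection("ball", 0.67, (605, 338, 642, 372)),
]

kept = filter_detections(detections, min_confidence=0.5)
best = best_per_label(kept)

# Output the coordinates for further use.
for label, d in best.items():
    print(f"{label}: box={d.box} confidence={d.confidence:.2f}")
```

Keeping a single best detection fits the ball, since there is only one on the pitch. For players, where several true detections are expected per frame, the threshold step alone (or a merge of overlapping boxes) would be the more natural choice.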

TensorFlow and Keras: Popular for developing custom models.
OpenCV: Widely used for image processing and can be used in conjunction with deep learning models.

Video Stream: If applied to a video stream, the algorithm runs continuously, detecting the ball in each frame.
The actual process may involve more or fewer steps depending on the specific algorithm or model used. A human reviewing the image, by contrast, would locate the ball by recognizing the shape, size, and pattern characteristic of soccer balls.
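
The per-frame loop described for video streams can be sketched as below. Both the frame format (a dict with a `ball_at` key) and `fake_detector` are hypothetical stand-ins so the example runs on its own; a real pipeline would read frames from a camera or video file and call an actual model.

```python
def detect_in_stream(frames, detector):
    """Run the detector on every frame and yield (frame_index, detections)."""
    for index, frame in enumerate(frames):
        yield index, detector(frame)


def fake_detector(frame):
    """Stand-in for a real model: reports the ball position stored in the frame."""
    position = frame.get("ball_at")
    return [position] if position is not None else []


# Three made-up frames; the ball is not visible in the second one.
frames = [
    {"ball_at": (10, 20)},
    {"ball_at": None},
    {"ball_at": (14, 22)},
]

track = list(detect_in_stream(frames, fake_detector))
for index, found in track:
    print(f"frame {index}: {found if found else 'no ball detected'}")
```

Writing the loop as a generator means frames are processed one at a time as they arrive, which matches how a live stream behaves.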

When using pre-trained models such as YOLO, SSD, and Faster R-CNN, the key seems to be finding the right balance in the trade-off between speed and the ability to detect people and the ball accurately.
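
To compare candidates on the speed side of that trade-off, one simple approach is to time each detector over the same set of frames. The sketch below is only an assumption about how such a measurement could look: `measure_fps` and `dummy_detector` are hypothetical, and the `clock` parameter exists solely so the timing source can be swapped out.

```python
import time


def measure_fps(detector, frames, clock=time.perf_counter):
    """Return how many frames per second the detector processes on these frames."""
    start = clock()
    for frame in frames:
        detector(frame)
    elapsed = clock() - start
    # Guard against a zero reading on very fast runs or coarse clocks.
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def dummy_detector(frame):
    """Stand-in for a real model; does a little arithmetic instead of inference."""
    return sum(range(1000))


frames = list(range(100))  # placeholder frames; real ones would be images
fps = measure_fps(dummy_detector, frames)
print(f"{fps:.1f} frames per second")
```

Running the same measurement for each model, alongside a count of how often people and the ball are found correctly, gives the two numbers needed to pick a balance point.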


디지털 연금술사 연구소