I've compiled the videos you've attached in Drive and run a demo. I can share the link in chat, as this proposal is public. The video shows all persons as and when they enter and leave the frame being accurately tracked. That is, all the persons irrespective of time of presence, shape, size, attire, gender are detected. This is the first step towards other attributes you want to notice and label.
On your video, it runs on 10 to 20 FPS depending on number of people. There is no upper bound to number of people it can detect. There is also detection in case of partial occlusions.
REGARDING THIS PROJECT
I've bid the budget and deadline, but they are negotiable based on your requirements and constraints.
TIMELINE
I shall create a TIMELINE pdf with all details, goals, milestone payments, deliverables, and deadlines, which you can follow to track progress.
Kindly initiate a chat to view the tracked results. It is 100MB, so I shall share the video link via cloud. I'm a research scholar in machine learning and computer vision and take projects in my domain. Thank you!