Legal and policy loopholes in the use of human-generated data for artificial intelligence training are increasingly being exploited by companies, with experts warning that India’s still-evolving data protection framework has left wide grey areas around consent, privacy, and the commercial use of recordings captured inside homes and workplaces.
The latest controversy surrounding domestic services start-up Pronto’s use of in-home recordings for AI training has brought the issue into sharp focus, but the practice itself is part of a much wider trend in which companies are collecting so-called “egocentric” or first-person human activity data to train robotics and embodied AI systems.
Pronto controversy sparks wider scrutiny
The Pronto episode triggered public scrutiny after investor documents cited in media reports said the company was generating data to train “physical AI and robotics systems” and was already piloting “real-world training data with leading physical AI labs”. The company later acknowledged that it was conducting a pilot involving recordings during domestic services such as dishwashing, folding laundry and cleaning, while maintaining that participation was voluntary and limited to 0.1 per cent of customers. The backlash also prompted rivals such as Urban Company and Snabbit to publicly distance themselves from any similar practices.
Legal experts say India is emerging as a hub for such data collection because of the availability of low-cost labour and relatively weak worker protection frameworks. At the same time, privacy experts point out that the law has not kept pace with the rapid expansion of AI training practices, especially when recordings are sourced from private homes or workplaces. Experts said the issue exposes a major regulatory vacuum. Sonam Chandwani, Managing Partner, KS Legal, said the practice raises significant legal questions under India’s evolving privacy and data protection regime.
“Indian law presently does not contain a dedicated regulatory framework specifically governing AI training through domestic recordings, which is why the issue is being viewed as a grey area. However, the absence of a direct prohibition does not imply unrestricted legality. Companies engaging in such practices would still be required to satisfy existing standards relating to consent, transparency, lawful purpose, proportionality, storage limitation and data security under Indian privacy and consumer law frameworks,” she said.
Consent, anonymity and legal ambiguity
Chandwani pointed out that while the principal legislation governing the area is the Digital Personal Data Protection Act, 2023, the issue extends beyond whether a customer merely clicked “agree” on an app.
“The legal issue is not merely whether a consumer clicked ‘agree’ on an application, but whether the consumer was adequately informed that recordings from inside their home may be captured, stored, analysed or used for AI model development. The expectation of privacy inside a home is substantially greater, and any collection or commercial exploitation of such data would require a correspondingly higher level of disclosure, necessity and proportionality,” Chandwani added.
Tanisha Khanna, Partner, Trace Law Partners, said the current regulatory position leaves substantial room for companies to legally use certain forms of data for AI model training.
“India’s Digital Personal Data Protection Act, 2023 is not in force as yet, and will be brought into force in May 2027. Till such time as it is notified, there are no enforceable statutory obligations on most personal data collection and processing activities, including at-home recordings,” she said.
Khanna added that even after the law comes into effect, its protections may not fully extend to anonymised or non-personal datasets.
“Even once the DPDPA comes into force, it will apply only to personal data, which is information that can identify an individual. Non-personal data, or personal data which is anonymized, for instance at-home recordings stripped of faces, voices or other identifiers, falls outside its scope, and can arguably be used for model training without any consent, opt out, and other safeguards,” she said.
Experts call for stronger safeguards
Khanna further noted that the concepts of “informed” and “unconditional” consent themselves become difficult to apply in the AI context because companies may not know or disclose all future use cases at the time data is collected.
“The requirement that consent be ‘informed’ creates additional complexity in the AI context, since all use cases may not be known and disclosed, for the consumer to make an informed decision. The requirement that consent be ‘unconditional’ would also need to be tested in this context, namely whether providing subsidized services and discounts in exchange for personal data collection or processing would be viewed as ‘unconditional’,” Khanna added.
Arjit Benjamin, Associate Partner, Prosoll Law, said the absence of clear law does not diminish the privacy and consumer risks involved. “India’s law on such in-home recording and secondary AI reuse is still unclear, but the privacy and consumer-protection risks are real. Until clearer rules emerge, startups should adopt strict opt-in consent, data minimization, purpose limitation, short retention, and strong security safeguards, and should treat both raw recordings and derived AI datasets as sensitive,” he said.
Published on May 25, 2026