123 points by techgenius 6 months ago | 11 comments
datawhizz 6 months ago next
This is a great article on GANs and how they're changing the game for synthetic data generation. I've been experimenting with these techniques myself and the results have been staggering.
ml_enthusiast 6 months ago next
GANs also have applications in adversarial attacks, which is worth keeping in mind when using them to generate synthetic data. <https://arxiv.org/abs/1901.09502>
deeplearning_fanatic 6 months ago next
Thanks for the article recommendation! Adversarial attacks are a perfect reason for researchers to investigate GAN evaluation methods more thoroughly IMO.
ml_enthusiast 6 months ago next
A heads-up for anyone evaluating GANs: don't rely on a single traditional metric like the Fréchet Inception Distance (FID), since it may not fully reflect the quality of the generated synthetic data. <https://arxiv.org/abs/2007.06531>
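To illustrate why a single score can mislead, here is a rough sketch, assuming a one-dimensional stand-in for the Inception features (real FID compares multivariate feature means and covariances). The function name and the toy samples are made up for illustration. The Fréchet distance only looks at the fitted mean and spread, so two samples with clearly different shapes can still score close to zero.

```python
import math
import statistics


def frechet_distance_1d(xs, ys):
    """Squared Frechet distance between Gaussians fitted to two 1-D samples.

    In one dimension this reduces to (mean gap)^2 + (std-dev gap)^2.
    """
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2


# Hypothetical "real" features: two sharp modes at -1 and +1 (mean 0, variance 1).
real = [-1.0, 1.0] * 30

# Hypothetical "synthetic" features: three modes, chosen to have the same
# mean and variance as `real` even though the shape is different.
r = math.sqrt(1.5)
fake = [-r, 0.0, r] * 20

# A copy of the real sample shifted by 2: same shape, different mean.
shifted = [x + 2.0 for x in real]

score_fake = frechet_distance_1d(real, fake)        # near zero despite the different shape
score_shifted = frechet_distance_1d(real, shifted)  # large, driven only by the mean gap

print(f"different shape, same moments: {score_fake:.6f}")
print(f"same shape, shifted mean:      {score_shifted:.6f}")
```

The three-mode sample is not a faithful copy of the two-mode one, yet the score cannot tell them apart, which is one reason to pair FID with other checks.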
gn_always 6 months ago prev next
I couldn't agree more! GANs have the potential to democratize access to valuable training data for machine learning applications. I do wonder about their limitations, though. How well do they perform when generating highly structured data such as time series?
datawhizz 6 months ago next
Great question! GANs can certainly generate structured data, but it's a more specialized process that requires careful initialization. This article gives a good rundown on generating time series data with GANs: <https://arxiv.org/abs/1903.12262>
synthetic_me 6 months ago prev next
I'm curious if GANs have been applied to generate synthetic data for specific enterprise use cases? For example, we work with medical records and have strict privacy concerns. Perhaps GANs could help us generate synthetic data to train our models while preserving patient privacy.
neural_marketer 6 months ago next
I can see both ethical and regulatory ramifications with generative models in synthetic data generation. Do you know if researchers and developers in this field are actively working on guidance and best practices?
dn_researcher 6 months ago next
Yes. As more companies move away from strictly centralized data storage, frameworks like differential privacy are becoming increasingly important. Here's a brief introduction: <https://papers.nips.cc/paper/2016/file/633598a2ec1498235dd564c8eb9da834-Paper.pdf>
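For a concrete feel of what differential privacy means in practice, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is an illustration under stated assumptions, not taken from the linked paper: the records, the helper names, and the epsilon value are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return an epsilon-differentially-private count of matching records.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical patient ages, used only for illustration.
ages = [34, 71, 58, 45, 66, 29, 80]
rng = random.Random(0)  # seeded so the run is repeatable

noisy = private_count(ages, lambda age: age >= 65, epsilon=1.0, rng=rng)
print(f"noisy count of patients aged 65+: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy. Each released answer spends part of the privacy budget, so repeated queries on the same data need to be accounted for.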
datawhizz 6 months ago prev next
@synthetic_me - Yes, indeed! GANs (specifically MedGANs) have been explored for generating synthetic medical data, with promising results. See this paper for more: <https://arxiv.org/abs/1612.00595>
healthcare_ai 6 months ago next
@synthetic_me @datawhizz - Here's another great resource to add to your list: <https://arxiv.org/abs/1904.09854>. It really helps solidify the potential privacy-preserving benefits of using GANs to generate synthetic data in healthcare.