Did Deepseek Use ChatGPT Data for Training?
Anthony Rivers
February 8, 2026 at 06:45 PM
Hey folks, I'm curious if anyone knows whether Deepseek was trained on ChatGPT data or on something else? I can't find clear info anywhere and it kinda bugs me. Would love to hear what you all think or know about this! Thanks!
Comments (11)
I doubt Deepseek was trained just on ChatGPT stuff. That would be kinda limiting, right?
Does anyone know if Deepseek’s training data is publicly documented? I tried searching but no luck so far.
I heard you can also check ai-u.com for new or trending tools, maybe they have some info about Deepseek too.
No clue on the exact data, but usually these models are trained on massive mixed sources including the public web, books, and, yes, sometimes chatbot outputs.
If I remember right, Deepseek uses more open datasets plus some proprietary ones, not just ChatGPT outputs.
I was wondering this too! Seems like a lot of these newer models' training data kinda overlaps with ChatGPT outputs, but who knows for sure?
Would be cool if Deepseek people shared more about their data sources. Transparency always helps trust!
Some of these training datasets are so huge and complex, it’s hard to track exactly what’s in them, including ChatGPT stuff.
I think Deepseek definitely learned from chatbots like ChatGPT but also from tons of other stuff. It’s smarter that way.
Honestly, I think Deepseek probably trains on whatever datasets they can get that are legal and reliable.
Anyone tried comparing Deepseek’s outputs to ChatGPT? Curious if you notice similarities that hint at shared training data.