189 points by alexwlchan 5 months ago | 26 comments
user7 5 months ago
I agree! It's always exciting to see new and innovative applications of machine learning.
john_doe 5 months ago
Great project! I've been looking for something like this for a while.
jane_doe 5 months ago
I know, right? It could save a lot of time reviewing student assignments. Upvoted!
deep_learning_fan 5 months ago
I'm curious about the model you used. Could you please share some details about the architecture and training process?
author 5 months ago
Sure, I used a simple CNN with some text preprocessing. I can provide more details if there's interest.
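For readers wondering what "a simple CNN with some text preprocessing" could look like, here is a minimal sketch in plain Python. It is not the author's code: the tokenizer regex, the toy `EMBEDDINGS` table, and the single hand-written filter are all assumptions made for illustration. A real model would learn its embeddings and filters during training.

```python
import re

# Hypothetical toy embedding table; a trained model would learn these vectors.
EMBEDDINGS = {"def": [1.0, 0.0], "return": [0.0, 1.0]}


def tokenize(source):
    """Lowercase the text and split it into word, number, and symbol tokens."""
    return re.findall(r"[a-z_]+|\d+|[^\sa-z_\d]", source.lower())


def embed(tokens):
    """Map each token to its vector; unknown tokens become a zero vector."""
    return [EMBEDDINGS.get(token, [0.0, 0.0]) for token in tokens]


def conv1d_max(embedded, kernel):
    """Slide one filter over the token vectors, apply ReLU, then max-pool."""
    width = len(kernel)
    best = 0.0  # ReLU output is never negative, so 0.0 is a safe floor
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        activation = sum(
            weight * value
            for kernel_vec, token_vec in zip(kernel, window)
            for weight, value in zip(kernel_vec, token_vec)
        )
        best = max(best, max(0.0, activation))
    return best


tokens = tokenize("def main(): return 0")
# A width-2 filter that responds to "def" in the first slot
# or "return" in the second slot.
kernel = [[1.0, 0.0], [0.0, 1.0]]
feature = conv1d_max(embed(tokens), kernel)
```

A full classifier would stack many such filters and feed the pooled features into a final layer; this only shows the shape of one convolution-and-pool step.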
another_user 5 months ago
What's the accuracy of the model? I'm wondering how well it would perform in practice.
author 5 months ago
I haven't tested it extensively yet, but the preliminary results look promising. I'll do more testing and share the results.
helpful_commenter 5 months ago
Have you thought about using the model to flag potential violations of academic integrity policies? That could be a really impactful use case.
author 5 months ago
Yes, I've considered that. It's definitely an interesting application of the technology. However, I think it's important to proceed with caution and ensure that the model is used responsibly.
automation_skeptic 5 months ago
Do you think this type of automation could lead to false positives and undeserved penalties for students? I'm concerned about the potential impact on learners.
author 5 months ago
That's a valid concern. The model needs to be thoroughly tested for accuracy and reliability before it's used in any kind of high-stakes situation, and, as I mentioned before, it has to be used responsibly even then.
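One way to make the false-positive concern measurable is to report precision and false positive rate alongside accuracy, since accuracy alone can look good when most repos are not homework. The sketch below assumes binary labels (1 = homework, 0 = other); the function name and the example data are made up for illustration and are not results from this project.

```python
def flag_metrics(labels, predictions):
    """Compute precision, recall, and false positive rate for binary flags."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        # Of everything flagged, how much was really homework?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of all real homework, how much was flagged?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Of all non-homework, how much was wrongly flagged?
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up example: 2 homework repos, 4 other repos.
metrics = flag_metrics([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0])
```

For a use case where a flag could lead to a penalty, the false positive rate is probably the number to watch most closely.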
user1 5 months ago
Nice work! I've always been fascinated by applying machine learning to real-world problems.
user2 5 months ago
Agreed! It's amazing what can be accomplished with the right blend of creativity and technical expertise.
user3 5 months ago
I'm wondering if the model could be retrained on other types of code to detect other common patterns or characteristics. For example, could it be used to detect code that is particularly well-organized or well-documented?
author 5 months ago
That's an interesting idea. I haven't explored that specific use case, but I'm open to the possibility. I think there's a lot of potential for this type of technology.
user4 5 months ago
I'm curious about the dataset you used to train the model. Where did you find it, and how did you ensure that it was representative and unbiased?
author 5 months ago
I created the dataset myself by scraping public repos from GitHub. I tried to include a diverse range of topics and programming languages, but of course there is always the potential for some bias. I did my best to ensure that the dataset was balanced and representative, but I'm open to feedback and suggestions for improvement.
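A quick way to sanity-check the "balanced and representative" claim on a scraped dataset is to look at the share of positive labels per programming language. This is only a sketch: the `(language, label)` tuple layout and the sample rows are assumptions, not the author's actual data format.

```python
from collections import defaultdict


def positive_share_by_language(samples):
    """Return {language: fraction of samples labeled as homework}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for language, label in samples:
        totals[language] += 1
        positives[language] += 1 if label == 1 else 0
    return {lang: positives[lang] / totals[lang] for lang in totals}


# Hypothetical rows: (language, label) where 1 = homework, 0 = other.
samples = [("python", 1), ("python", 0), ("java", 1), ("java", 1)]
shares = positive_share_by_language(samples)
```

A language where nearly every sample carries the same label is a warning sign: the model may end up learning the language rather than the homework patterns.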
user5 5 months ago
How do you deal with repos that contain a mix of homework assignments and other content? It seems like there could be a lot of false positives in those cases.
author 5 months ago
That's a great point. Currently, the model looks for specific patterns and features that are common in homework assignments. If a repo contains both homework and other content, there is a possibility of false positives. However, I'm working on improving the model to better handle those cases. I appreciate the feedback!
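One possible approach to mixed repos, sketched here purely as an illustration, is to score files individually and only label the whole repo when the files mostly agree. This assumes the model can produce a per-file score between 0 and 1, which the thread does not confirm; the thresholds and names below are hypothetical.

```python
def homework_fraction(file_scores, file_threshold=0.5):
    """Fraction of files whose score meets the per-file threshold."""
    if not file_scores:
        return 0.0
    flagged = sum(1 for score in file_scores.values() if score >= file_threshold)
    return flagged / len(file_scores)


def classify_repo(file_scores, file_threshold=0.5, low=0.2, high=0.8):
    """Label a repo 'homework', 'other', or 'mixed' from per-file scores."""
    fraction = homework_fraction(file_scores, file_threshold)
    if fraction >= high:
        return "homework"
    if fraction <= low:
        return "other"
    # Files disagree, so avoid a hard verdict on the whole repo.
    return "mixed"


# Hypothetical scores for a repo that holds two assignments and two other files.
label = classify_repo({"hw1.py": 0.9, "hw2.py": 0.8, "lib.py": 0.1, "app.py": 0.2})
```

The explicit "mixed" outcome keeps ambiguous repos out of the positive class instead of forcing them into a yes/no answer.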
user6 5 months ago
I'm impressed by the creativity and ingenuity of this project. I'm excited to see where it goes and how it evolves in the future.
user8 5 months ago
I'm curious if you've thought about using the model to detect other types of code patterns or behaviors, such as writing insecure code or violating best practices. That could be a really valuable tool for educators and developers alike.
author 5 months ago
That's a great idea. I'm definitely interested in exploring that possibility. I think there's a lot of potential for this technology to help developers and learners improve their code and avoid common pitfalls.
user9 5 months ago
I'm wondering if the model could be adapted to work with other version control systems, such as SVN or Mercurial. That would make it even more versatile and widely applicable.
author 5 months ago
I haven't explored that specific use case yet, but I'm definitely open to the possibility. I think it would require some modifications to the model and the dataset, but it's certainly worth considering. Thanks for the idea!
user10 5 months ago
I'm curious how well the model generalizes to new repos and codebases. Have you tested it on a wide variety of datasets, or only on the one you used to train it?
author 5 months ago
I've done some testing on new datasets, and the results look promising so far. However, I agree that more extensive testing is needed to ensure that the model generalizes well to a wide range of codebases. I'll definitely prioritize that in the future.
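When testing generalization on scraped repos, one common pitfall is letting files from the same repo land in both the training and test sets, which inflates the results. A minimal sketch of a split that keeps each repo on one side is below. The `(repo_name, item)` layout and the hash-based bucketing are assumptions for illustration, not how this project was evaluated.

```python
import hashlib


def split_by_repo(samples, test_fraction=0.2):
    """Split (repo_name, item) pairs so that each repo is entirely train or test."""
    train, test = [], []
    for repo, item in samples:
        # Hash the repo name so the assignment is deterministic and
        # every file from the same repo gets the same bucket.
        digest = hashlib.sha256(repo.encode("utf-8")).digest()
        bucket = digest[0] / 256  # roughly uniform value in [0, 1)
        if bucket < test_fraction:
            test.append((repo, item))
        else:
            train.append((repo, item))
    return train, test


# Hypothetical repo names and files.
samples = [
    ("alice/cs101", "a.py"),
    ("alice/cs101", "b.py"),
    ("bob/webapp", "c.py"),
    ("carol/algos", "d.py"),
]
train, test = split_by_repo(samples)
```

With only a handful of repos the actual split sizes will be far from the requested fraction; the guarantee is only that no repo appears on both sides.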