Education is changing. New methodologies such as project-based learning (PBL) and maker activities have made their way into schools. These new learning experiences generate new types of data: artifacts (with videos and pictures of them), project documentation, diagrams, and so on. What can this new data tell us about these learning experiences? How can we interpret it?
One of my goals for the next three weeks is to explore how much information can be gathered from children's eportfolios about their understanding, how well vision and natural language processing (NLP) models work with children's data, and how much data is needed to make those models work.
This week I tried to classify pictures of children's school work into three classes: worksheets, diagrams, and projects. I trained the model with the fastai API on data gathered from Google Images: children's diagrams, worksheets, and projects. Then I used that model to predict the same classes on a test set of 1,000 pictures from a school. The results were inconsistent: anything that looked a little 2D and had many colors was identified as a diagram, and many actual diagrams were classified as worksheets. I believe the main problem is finding good training data. You can see my experiment here. I might have to label the data myself... :P
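One way to make that inconsistency concrete is to tally a confusion matrix over the test set, so patterns like "diagrams drifting into the worksheet class" show up as counts rather than impressions. Here is a minimal sketch in plain Python; the labels below are made-up examples of the failure mode, not my actual results.

```python
from collections import Counter

CLASSES = ["worksheet", "diagram", "project"]

def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) pairs so misclassification patterns stand out."""
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in CLASSES} for t in CLASSES}

# Hypothetical labels illustrating the problem described above:
# colorful 2D pictures drift toward "diagram", diagrams toward "worksheet".
y_true = ["diagram", "diagram", "diagram", "worksheet", "project", "project"]
y_pred = ["worksheet", "worksheet", "diagram", "diagram", "diagram", "project"]

cm = confusion_matrix(y_true, y_pred)
for t in CLASSES:
    print(t, cm[t])
```

In this toy example the row for `diagram` would show two pictures sent to `worksheet`, which is exactly the kind of per-class breakdown I'd want before deciding whether the fix is more training data or better labels.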