Research tells us that one of the best ways for people to develop enduring understanding is to apply knowledge in a project (a form of project-based learning). When we apply what we know in a project, we learn it more deeply, we can recall it later, and it becomes a tool we can keep applying.
Most schools don’t use this methodology, and one of the main causes is teachers’ bandwidth: it’s very hard to let each student create a unique project. The few schools that try project-based learning use it in one of two ways. Some use it as a toy project that teachers ask students to do and that, in most cases, has nothing to do with how that knowledge is used in the real world. Others use it as a separate class called “making,” where students create their own personal projects that are normally completely disconnected from formal learning. Both activities are steps in the right direction, but the first is much less engaging and real and doesn’t support the development of identity and creativity, while the second is less about applying knowledge and more about letting kids use their hands and creativity in a less structured way.
After training teachers in project-based learning (PBL) all around the world and being on the founding team of a school where learning is 100% project-based, I learned that a few things can substantially help teachers and students do PBL. The first is knowing what kinds of projects can be done with a specific set of tools, materials and time; the second is knowing what skills a student needs to be able to do a project; the third is knowing what knowledge will be applied in doing that project; and the last is knowing what can be done when a student is stuck on a specific part of a project. But even if this information is available to teachers, it’s very time-consuming for a teacher to look it all up for each student’s project. To give students more agency and support the development of their independence, I propose developing a webapp, Cielo, that students can use by themselves.
Cielo will ask the user for at least 6 pictures of the current status of the project, a description of the overall project, a description of its current status, the category of the project, and the tools and materials. Using picture captioning, supervised learning and NLP, Cielo will return a list of projects similar to the one the student wants to make, a list of resources for the skills needed to develop that project, a list of resources based on the Next Generation Science Standards related to the scientific ideas in the project, and captions in English for their pictures. The webapp is intended to work even with very little text, as long as there are enough pictures. This way even young children or children who don’t speak English can use Cielo.
Data: Plumbing v1
I’ll be using articles from Make Magazine (thanks a lot to Dale for it), Hackster and Instructables. The articles explain how to make different types of projects, from IoT to cooking, arts and crafts, and robotics. The projects vary in time, expertise, materials, skills, tools and topics. All of them have pictures, a description of the project, a category and a list of steps to make the project.
I started with the Make Magazine data because they gave me an XML file (thanks again!). I parsed all the fields I cared about (tools, description, steps, conclusion, URLs of pictures, level, categories, and duration of the project) and decided to use only the published articles (~2750). I realized that some of the labels were too general or didn’t correspond to what I needed, so I created my own labels, mapped some of the existing labels onto them, and relabeled around 900 articles.
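To make the parsing and relabeling step more concrete, here is a minimal sketch using Python's standard library. The real Make Magazine schema isn't shown in this post, so every tag name, attribute name and label in the example (the `article` element, the `status` attribute, the entries in `LABEL_MAP`) is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format: the real schema is not shown in this post,
# so every tag and attribute name below is an assumption.
SAMPLE_XML = """
<articles>
  <article status="publish">
    <title>Line-Following Robot</title>
    <description>A small robot that follows a black line.</description>
    <category>Robots</category>
    <category>Arduino</category>
    <image url="https://example.com/robot-1.jpg"/>
  </article>
  <article status="draft">
    <title>Unfinished Lamp</title>
    <description>Work in progress.</description>
    <category>Home</category>
  </article>
</articles>
"""

# Hypothetical mapping from source categories to a smaller custom label set.
LABEL_MAP = {"Robots": "robotics", "Arduino": "electronics", "Home": "home"}


def parse_published(xml_text):
    """Return one dict per published article, with categories remapped."""
    root = ET.fromstring(xml_text.strip())
    projects = []
    for article in root.iter("article"):
        if article.get("status") != "publish":
            continue  # keep only published articles
        categories = [c.text for c in article.findall("category")]
        projects.append({
            "title": article.findtext("title", default=""),
            "description": article.findtext("description", default=""),
            "image_urls": [img.get("url") for img in article.findall("image")],
            # Categories without a mapping are dropped here; those are the
            # ones that would need manual relabeling.
            "labels": sorted({LABEL_MAP[c] for c in categories if c in LABEL_MAP}),
        })
    return projects


projects = parse_published(SAMPLE_XML)
print(projects[0]["title"], projects[0]["labels"])
```

Keeping the mapping in a plain dictionary makes it easy to see which source categories still have no custom label and therefore need a human to look at them.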
I’ll be using the pictures to get some information about the category of each project and the different parts of its process. I also want to caption the pictures and use that information. Finally, I’ll be using the text to predict the category and the parts of the process as well.
After parsing all the text I wanted to create a simple model, a Bag of Words (BoW). My text is not all clean, so a BoW model can show me how good the data is. I used Keras and spaCy to train a very simple model with two dense layers and Adam as the optimizer. Over 100 epochs, accuracy went from 0.2219 to 0.6760 and loss went from 4.8035 to 1.4918. Given that there are 20 labels and the dataset is multilabel and imbalanced, that’s not that horrible.
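For readers new to the idea, here is a minimal sketch of what a bag-of-words encoding with multilabel targets looks like. It is not the Keras and spaCy pipeline described above: a plain whitespace split stands in for spaCy tokenization, the three documents and their labels are invented for illustration, and no model is trained.

```python
from collections import Counter

# Toy data invented for illustration; not taken from the real dataset.
docs = [
    ("solder the led to the arduino board", ["electronics"]),
    ("mix the flour and bake the bread", ["cooking"]),
    ("solder a motor driver and build the robot", ["electronics", "robotics"]),
]


def tokenize(text):
    # Whitespace split stands in for spaCy tokenization (an assumption).
    return text.lower().split()


# Vocabulary: every distinct token gets a column index, in sorted order.
all_tokens = sorted({tok for text, _ in docs for tok in tokenize(text)})
vocab = {tok: i for i, tok in enumerate(all_tokens)}
labels = sorted({lab for _, labs in docs for lab in labs})


def bow_vector(text):
    """Count how often each vocabulary word occurs; word order is discarded."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]


def multi_hot(doc_labels):
    """One slot per label, set to 1 for every label the project carries."""
    return [1 if lab in doc_labels else 0 for lab in labels]


X = [bow_vector(text) for text, _ in docs]  # model inputs
Y = [multi_hot(labs) for _, labs in docs]   # multilabel targets

print(labels)
print(Y)
```

Rows of `X` and `Y` like these are what a small dense network would be trained on. For multilabel targets, a sigmoid output layer with binary cross-entropy is a common pairing, though this post doesn't say which loss was used.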