Academic success is an established term in education and assessment whose meaning has evolved considerably. While some define ‘academic success’ in terms of standard measures such as grades in a series of exams, others use the term more broadly. A growing body of studies argues that academic success is not just the marks obtained in an exam but also the learning and holistic development of a student, including improvement in the student’s attitude towards exams and problems, academic or otherwise. At Embibe, we have already developed several parameters, such as the Embibe Score Quotient, to measure a student’s performance in tests, and we use standardised models like Concept Mastery that rely heavily on the Bayesian Knowledge Tracing algorithm.
Recently, we developed a new metric, the Sincerity Score, which measures a student across three parameters and assigns a behaviour or a combination of behaviours (a sketch of how these parameters can be computed follows the list below):
- Accuracy: Percentage of questions the student answered correctly out of the total questions attempted in the test.
- Attempt Percentage: Percentage of questions attempted by the student out of the total questions in the test.
- Time Percentage: Percentage of time taken by the student out of the total time allocated to the test.
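As a concrete illustration, these three parameters can be computed from a raw test session as follows. The `TestSession` fields used here (`correct`, `attempted`, `total_questions`, `time_taken_sec`, `allotted_sec`) are assumptions made for this sketch, not Embibe's actual schema.

```python
from dataclasses import dataclass

@dataclass
class TestSession:
    correct: int            # questions answered correctly
    attempted: int          # questions attempted
    total_questions: int    # total questions in the test
    time_taken_sec: float   # time the student spent on the test
    allotted_sec: float     # total time allocated to the test

def accuracy(s: TestSession) -> float:
    """Correct answers as a percentage of questions attempted."""
    return 100.0 * s.correct / s.attempted if s.attempted else 0.0

def attempt_percentage(s: TestSession) -> float:
    """Questions attempted as a percentage of total questions."""
    return 100.0 * s.attempted / s.total_questions

def time_percentage(s: TestSession) -> float:
    """Time spent as a percentage of the allocated test duration."""
    return 100.0 * s.time_taken_sec / s.allotted_sec
```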
Each parameter is classified into different partitions that culminate in 10 unique behaviours: four positive, five negative, and one neutral. A rank is allocated to each behaviour, from the best to the worst possible behaviour. Over 2.5 million valid test sessions were analysed to develop an algorithm that identifies the test-on-test score improvement of every behaviour compared to less positively ranked behaviours. The results verify and quantify, in a data-driven manner, the long-held belief that a student’s attitude helps determine their progress and can help even a below-average student achieve great results. Based on the thresholds and the classification of students’ test-session behaviour, we can quantify how much faster a behaviour improves on average compared to less positive behaviours, nudge students towards the appropriate behaviour, and inform them of the progress rate they can achieve with the improved behaviour, thus improving learning outcomes.
Table 1: Different Sincerity Score Behaviours With Their Metadata
Sincerity Score | Meaning | Rank/Weight (1 = best, 10 = worst) | Attribute |
---|---|---|---|
In Control | The child is putting in the needed effort and often succeeding | 1 | Positive
Marathoner | The stamina of the average session duration is high | 2 | Positive
Trying Hard | The stamina of the average session duration is very low | 3 | Positive
Getting There | The stamina of the average session duration is average | 4 | Positive
Slow | The child can succeed with much effort | 5 | Neutral
Train Harder | The child is not putting in enough effort to succeed very often | 6 | Negative
Overconfident | The child is dominantly overconfident, applying themselves without putting in enough effort | 7 | Negative
Low Confidence | The child is not confident enough to apply themselves | 8 | Negative
Jumping Around | The stamina of the average session duration is very low | 9 | Negative
Careless (lack of interest, focus, or concentration) | The child is dominantly under-applying themselves to the material at hand and losing marks as a result | 10 | Negative
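For use in the procedure below, Table 1 can be encoded as a simple lookup of rank and attribute per behaviour, with the weight of each behaviour taken as Rank * (10/55), as explained in the normalisation step later. This is a minimal sketch; the actual thresholds that partition accuracy, attempt percentage, and time percentage into these behaviours are not reproduced here.

```python
# Behaviour -> (rank, attribute) as per Table 1.
BEHAVIOURS = {
    "In Control":     (1,  "Positive"),
    "Marathoner":     (2,  "Positive"),
    "Trying Hard":    (3,  "Positive"),
    "Getting There":  (4,  "Positive"),
    "Slow":           (5,  "Neutral"),
    "Train Harder":   (6,  "Negative"),
    "Overconfident":  (7,  "Negative"),
    "Low Confidence": (8,  "Negative"),
    "Jumping Around": (9,  "Negative"),
    "Careless":       (10, "Negative"),
}

NUM_BEHAVIOURS = len(BEHAVIOURS)                            # 10
RANK_SUM = sum(rank for rank, _ in BEHAVIOURS.values())     # 55
NORMALISATION_FACTOR = NUM_BEHAVIOURS / RANK_SUM            # 10/55

def weight(behaviour: str) -> float:
    """Weight of a behaviour: Rank * (10/55), so 'Careless' weighs 10x 'In Control'."""
    rank, _ = BEHAVIOURS[behaviour]
    return rank * NORMALISATION_FACTOR
```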
Algorithm
Inputs:
N: Total number of valid test sessions with positive score improvement
p: Number of Sincerity Score behaviours.
Outputs:
For every Sincerity Score behaviour (Ranks 1-9), how much faster, on average, is the test-on-test score improvement compared to that of all less positive Sincerity Score behaviours?
Glossary:
- Pre-Test: Between any two successive tests (sorted by timestamp) given by a user on a specific exam, the first of the two is called the pre-test.
- Post-Test: Between any two successive tests (sorted by timestamp) given by a user on a specific exam, the second of the two is called the post-test.
- Valid Test Session: A test session is considered valid if a student:
- spends >= 10% of the time duration allocated to the test
- answers >= 10% of total questions in the test
- scores >= 0 marks in the test
- Less Positive Sincerity Score Behaviour: A Sincerity Score behaviour is considered less positive than a second behaviour if its rank value, as per Table 1, is greater than that of the second. Conversely, a behaviour is considered better, or more positive, than a second behaviour if it is ranked lower. A minimal sketch of the validity check and this rank comparison follows this glossary.
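The following sketch implements the validity check and the rank comparison from this glossary, reusing the `TestSession` fields assumed earlier and the `BEHAVIOURS` mapping from the previous sketch; the separate `score` argument is a hypothetical addition for illustration.

```python
# Assumes TestSession, BEHAVIOURS from the earlier sketches.

def is_valid_session(s: TestSession, score: float) -> bool:
    """Valid test session: >= 10% of allotted time spent, >= 10% of
    questions attempted, and a non-negative score (per the glossary)."""
    return (
        s.time_taken_sec >= 0.10 * s.allotted_sec
        and s.attempted >= 0.10 * s.total_questions
        and score >= 0
    )

def is_less_positive(behaviour_a: str, behaviour_b: str) -> bool:
    """True if behaviour_a is less positive than behaviour_b,
    i.e. its Table 1 rank is greater (1 = best, 10 = worst)."""
    return BEHAVIOURS[behaviour_a][0] > BEHAVIOURS[behaviour_b][0]
```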
Procedure:
- From all the test sessions on Embibe, only valid test sessions are considered.
Time Complexity for this step = O(N)
- Every user’s test session is classified into a Sincerity Score behaviour based on the accuracy, attempt percentage, and time spent in that test session.
Time Complexity for this step = O(N)
- The score difference between the pre-test and the post-test (sorted by timestamp, taken by the student with the same goal name and exam name) is calculated.
Time Complexity for this step = O(N log N)
- Based on Table 1, the rank of the worst Sincerity Score behaviour shown in the pre-test is applied; a student may be classified into two or more behaviours in any test session.
Time Complexity for this step = O(N)
- For every behaviour, all test entries wherein the student has exhibited that behaviour in the post-test are picked. Only those test entries wherein the student’s behaviour improved between the pre-test and the post-test, as per the ranking, are taken. We then compute a weighted mean of the behaviour transition over the score improvement:
Weighted_Mean = ( Σ( (weight of pre-test Sincerity Score - weight of post-test Sincerity Score) * score_improvement_achieved_by_user ) / size of the group ) * Normalisation Factor
where Normalisation Factor = number of Sincerity Score behaviours / Σ(rank weights allocated to each Sincerity Score behaviour) = 10/55
Since we give unequal weights to each rank, we must include a Normalisation Factor. If we had given equal weightage to each rank, i.e., if every rank weighed 1, the sum of the rank weights would have been 10. In our case, however, the weights are unequal: “In Control” must carry a weight 10x lighter than “Careless”. This is achieved by giving “In Control” a weight of 10/55 and “Careless” a weight of 10 * (10/55); in general, every behaviour is given a weight of Rank * (10/55). Observe that Σ(Rank * 10/55) for integer Rank ∈ [1, 10] equals 10, the same total that would have been obtained without the weight parameter.
Time Complexity for this step = O(pN)
- Scale down the values by a factor of 1.63, where
Scale Down Value = (maximum transition possible between ranks) * (Normalisation Factor)
= (10 - 1) * (10/55) = 9 * (10/55)
≈ 1.63
Time Complexity for this step = O(p)
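Putting the weighted mean, normalisation, and scale-down together, the sketch below computes the final value for one group of behaviour transitions. The `Transition` records and the example data are hypothetical, and the code reuses `BEHAVIOURS` and `NORMALISATION_FACTOR` from the earlier sketch; it illustrates the arithmetic of the procedure rather than Embibe's production pipeline.

```python
from typing import List, Tuple

# Each record: (pre-test behaviour, post-test behaviour, score improvement)
# for one valid (pre-test, post-test) pair. Assumes BEHAVIOURS and
# NORMALISATION_FACTOR from the earlier sketch.
Transition = Tuple[str, str, float]

# Maximum rank transition (10 - 1) times the normalisation factor, ~1.63.
SCALE_DOWN_VALUE = (10 - 1) * NORMALISATION_FACTOR

def scaled_weighted_mean(transitions: List[Transition]) -> float:
    """Weighted mean of behaviour transitions over score improvement,
    normalised by 10/55 and scaled down by ~1.63 as in the procedure."""
    # Keep only pairs where the behaviour actually improved (post rank < pre rank).
    improved = [
        (pre, post, gain)
        for pre, post, gain in transitions
        if BEHAVIOURS[post][0] < BEHAVIOURS[pre][0]
    ]
    if not improved:
        return 0.0
    total = sum(
        (BEHAVIOURS[pre][0] - BEHAVIOURS[post][0]) * gain  # rank transition * improvement
        for pre, post, gain in improved
    )
    weighted_mean = (total / len(improved)) * NORMALISATION_FACTOR
    return weighted_mean / SCALE_DOWN_VALUE

# Hypothetical example: two students moving to more positive behaviours.
example = [("Careless", "In Control", 12.0), ("Train Harder", "Getting There", 5.0)]
print(round(scaled_weighted_mean(example), 2))
```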
Flow Chart
Observations
Behaviour | Average Improvement Ratio | SSS String |
---|---|---|
In Control | 6.59 | In Control typically improves 6.6x faster than students exhibiting lower behaviours. |
Marathoner | 7.14 | Marathoner typically improves 7.1x faster than students exhibiting lower behaviours. |
Trying Hard | 8.49 | Trying Hard typically improves 8.5x faster than students exhibiting lower behaviours. |
Getting There | 6.19 | Getting There typically improves 6.2x faster than students exhibiting lower behaviours.
Slow | 4.28 | Slow typically improves 4.3x faster than students exhibiting lower behaviours. |
Train Harder | 3.85 | Train Harder typically improves 3.8x faster than students exhibiting lower behaviours. |
Overconfident | 5.51 | Overconfident typically improves 5.5x faster than students exhibiting lower behaviours. |
Low Confidence | 1.05 | Low Confidence typically improves 1.0x faster than students exhibiting lower behaviours. |
Jumping Around | 0.69 | Jumping Around typically improves 0.7x faster than students exhibiting lower behaviours. |
Amongst the 4 positive behaviours, it is observed that most students’ test sessions tend to show Marathoner (35.4%) and Getting There (30%) behaviours. Amongst the negative behaviours, students tend towards Jumping Around (33%) and Train Harder (50%).
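The SSS String column above is a templated message built from the Average Improvement Ratio. A minimal formatter for such a nudge could look like the following; the helper name `sss_string` is ours, not Embibe's.

```python
def sss_string(behaviour: str, improvement_ratio: float) -> str:
    """Format the nudge message shown in the Observations table,
    rounding the improvement ratio to one decimal place."""
    return (
        f"{behaviour} typically improves {improvement_ratio:.1f}x faster "
        f"than students exhibiting lower behaviours."
    )

print(sss_string("In Control", 6.59))  # "In Control typically improves 6.6x faster ..."
```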
Future Work
- Application of unsupervised learning to identify the behaviour of the students. This will enable the system to eliminate bias regarding time spent, accuracy, and attempt percentage.
- The normalization of the score improvement using learned objective functions via neural networks and other deep learning-based methods.
- Prediction of the user’s future Success Story on a particular test type, and analysis of our model’s performance alongside the Embibe Score Quotient.
- Recommendation Engine to serve the best content and tests based on personalised Success Stories.