For extra points, I took on the task of testing AGREP against texts (T) of different lengths, searching for a given pattern (P).
For this I used the same function to generate a T of a given length, and then searched for a pattern in it. The first results were quite noisy: with lengths of around 100-1000 characters, the measured times varied almost randomly depending on how busy the CPU was, so a search over a text of length 100 could come out slower than one over a text of length 1000.
To fix this, the T samples were generated with lengths from 1,000 to 100,000 characters. Also, since a difference of 1 or even 20 characters is negligible, the samples jump 100 characters from one T to the next. So, we generate T's of length 1,000, 1,100, 1,200, 1,300, and so on.
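The sample generation described above can be sketched like this (a minimal sketch; the original generator function isn't shown, so `generate_text` and its lowercase alphabet are assumptions):

```python
import random
import string

def generate_text(length):
    """Build a random lowercase text T of exactly `length` characters."""
    return ''.join(random.choices(string.ascii_lowercase, k=length))

# T lengths from 1,000 to 100,000 characters, stepping by 100.
lengths = list(range(1000, 100001, 100))
```

Each length in `lengths` then gets its own generated T, which is what the timing loop iterates over.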
This way we can plot each T's length against the time it took to search for P in it. Plotting that data gives:
As we can see, even with an 80,000-character T the times are still quite good, at approximately 0.06 seconds per pattern search.
Now, comparing with the Knuth-Morris-Pratt (KMP) and Boyer-Moore (BM) algorithms:
KMP behaves the worst of the three, with BM placing second.
In conclusion, AGREP is a very good string searching tool, with excellent running times compared to my other implemented string searching algorithms.
Code used to generate the computing times:
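The original timing script isn't shown here, so the following is a hedged sketch of how the measurements could be taken. The `agrep_search` name is a hypothetical stand-in for the actual AGREP implementation (Python's `str.find` is substituted only so the sketch runs on its own), and the reduced length range is just to keep the demo quick; the real run went up to 100,000:

```python
import random
import string
import time

def generate_text(length):
    """Build a random lowercase text T of exactly `length` characters."""
    return ''.join(random.choices(string.ascii_lowercase, k=length))

def time_search(search_fn, text, pattern):
    """Return the elapsed seconds for a single search call."""
    start = time.perf_counter()
    search_fn(text, pattern)
    return time.perf_counter() - start

# Hypothetical stand-in for the AGREP implementation being benchmarked.
def agrep_search(text, pattern):
    return text.find(pattern)

pattern = 'pattern'
results = []
# Demo range 1,000..10,000; the full experiment used 1,000..100,000.
for n in range(1000, 10001, 100):
    text = generate_text(n)
    results.append((n, time_search(agrep_search, text, pattern)))
```

Swapping `agrep_search` for the KMP or BM implementation yields the comparison data for the other two plots.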
Code used to graph the data (adjusting sizes and stuff for each algorithm):
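The original plotting code also isn't shown; a minimal matplotlib sketch, assuming the timings arrive as `(length, seconds)` pairs like the ones collected above, could look like this (the function name, labels, and output filename are all illustrative):

```python
import matplotlib
matplotlib.use('Agg')  # render to a file, no display needed
import matplotlib.pyplot as plt

def plot_times(results, title, outfile):
    """Plot search time against text length and save the figure."""
    lengths = [n for n, _ in results]
    times = [t for _, t in results]
    plt.figure(figsize=(8, 5))
    plt.plot(lengths, times, marker='.', linestyle='-')
    plt.xlabel('Length of T (characters)')
    plt.ylabel('Search time (seconds)')
    plt.title(title)
    plt.savefig(outfile)
    plt.close()

# Tiny hypothetical sample, just to show the call shape.
plot_times([(1000, 0.001), (2000, 0.002)], 'AGREP', 'agrep_times.png')
```

Adjusting the figure size and axis limits per algorithm is what the "adjusting sizes and stuff" above refers to.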
Very good. 4 extra points.