<HTML>
<!-- $Id: sim.html,v 1.7 2007/08/27 09:57:35 dick Exp $ -->
<HEAD>
<TITLE>The software and text similarity tester SIM</TITLE>
</HEAD>
<BODY>
<H1>The software and text similarity tester SIM</H1>
<H2>
<A HREF="http://www.cs.vu.nl/~dick/">Dick Grune</A>
</H2>
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/README.1st">SIM</A>
tests lexical similarity in texts in C, Java, Pascal, Modula-2, Lisp, Miranda,
and natural language.
It is used
<UL>
<LI>
to detect potentially duplicated code fragments in large software
projects, in program text, in shell scripts and in documentation
</LI>
<LI>
to detect plagiarism in software projects, educational and otherwise
</LI>
</UL>
<P>
SIM 2.19 is available as
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/sim_2_19.shar">
C sources</A>
and as
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/sim_2_19.zip">
MSDOS binaries</A>.
It is also available through ftp; the directory is
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester">
ftp.cs.vu.nl:/pub/dick/similarity_tester</A>.
There is a
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/sim.pdf">
Unix-style manual page</A>.
</P>
<P>
The software similarity tester is very efficient and allows us to compare
this year's students' work with that collected from many past years (much to
the dismay of some, mostly non-CS, students).
Students are told that their work is going to be compared, but some are
non-believers ...
</P>
<P>
The output of the similarity tester can be processed by a number of shell
scripts by Matty Huntjens
(<A HREF="http://www.cs.vu.nl/~matty/">matty@cs.vu.nl</A>).
These shell scripts take sim output and produce lists of suspect submissions,
histograms and the like.
The present version of these scripts is, however, very much geared to the
local situation at
<A HREF="http://www.vu.nl/">VU University Amsterdam</A>;
they are low on portability.
</P>
<P>
We are not afraid that students will try to tune their work to the
similarity tester.
We reckon that if they can do that, they can also do the exercise.
</P>
<P>
Since this piece of handicraft does not qualify as research, there are no
international papers on it.
The work was described in Dutch in
Dick Grune,
Matty Huntjens,
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/publications/Het_detecteren_van_kopieen_bij_informatica-practica.ps">
Het detecteren van kopie&euml;n bij informatica-practica</A>,
Informatie,
<STRONG>31</STRONG>,
11,
Nov 1989,
pp. 864-867
(<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/artikel.lit">
lit. ref.</A>).
An
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/Paper.ps">
English translation</A>
of the paper is also available.
The ftp directory contains a terse
<A HREF="ftp://ftp.cs.vu.nl/pub/dick/similarity_tester/TechnReport">
technical report</A>
about the internal workings of the program.
</P>
<H5>
<HR>
[<A HREF="CVS.html">Previous</A>]
[<A HREF="mag.html">Next</A>]
[<A HREF="http://www.cs.vu.nl/~dick/dick.html">Personal Page</A>]
[<A HREF="http://www.cs.vu.nl/~dick/">Professional Page</A>]
[<A HREF="http://www.cs.vu.nl/">CS</A>]
[<A HREF="http://www.few.vu.nl/">Faculty</A>]
[<A HREF="http://www.vu.nl/">VU University Amsterdam</A>]
<HR>
</H5>
<ADDRESS>
The software and text similarity tester SIM / Dick Grune /
<A HREF="mailto:dick@cs.vu.nl">dick@cs.vu.nl</A>
</ADDRESS>
</BODY>
</HTML>