fpc/utils/sim_pasc
2007-11-18 19:33:00 +00:00
add_run.c
add_run.h
aiso.bdy
aiso.spc
algollike.c
algollike.h
Answers * source code similarity tester (import of original 2.21 sources available 2007-11-18 18:43:44 +00:00
ChangeLog
clang.l
compare.c
compare.h
debug.par
error.c
error.h
hash.c
hash.h
HOWTO-FPC.txt
idf.c
idf.h
javalang.l
lang.h
language.h
lex.c
lex.h
LICENSE.txt
lisplang.l
m2lang.l
Makefile
miralang.l
options.c
options.h
pascallang.l
pass1.c
pass1.h
pass2.c
pass2.h
pass3.c
pass3.h
percentages.c
percentages.h
READ_ME
READ.ME
README.1st
runs.c
runs.h
settings.par
sim.1
sim.c
sim.h
sim.html
sim.txt
sortlist.bdy
sortlist.spc
stream.c
stream.h
sysidf.mk
sysidf.msdos
sysidf.unix
system.par
TechnReport
text.c
text.h
textlang.l
token.c
token.h
tokenarray.c
tokenarray.h

This is SIM, Software and text similarity tester, most recent revision
                                                               (2.19, 20050220)
by Dick Grune, Vrije Universiteit, Amsterdam, the Netherlands (dick@cs.vu.nl).

SIM tests lexical similarity in texts in C, Java, Pascal, Modula-2, Lisp,
Miranda and natural language. It can be used

- to detect potentially duplicated code fragments in large software projects,
- to detect plagiarism in software and text-based projects, educational and
  otherwise.
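
As a rough illustration of what "lexical similarity" can mean, the sketch
below finds the longest run of identical consecutive tokens shared by two
token sequences. It is not SIM's actual algorithm or code: the function name
longest_common_run, the use of plain ints as tokens, and the brute-force
search are all assumptions made for this example. A real checker would first
run a language-specific lexer over the input and would use a much faster
matching strategy.

```c
#include <stddef.h>

/* Hypothetical helper, not part of SIM.
 * Returns the length of the longest run of identical consecutive tokens
 * shared by a[0..na) and b[0..nb). Tokens are plain ints here; in a real
 * tool they would come from a lexer.
 * If pos_a / pos_b are non-NULL, they receive the start index of the
 * first longest run found in a and b respectively (0 if there is none). */
size_t longest_common_run(const int *a, size_t na,
                          const int *b, size_t nb,
                          size_t *pos_a, size_t *pos_b)
{
    size_t best = 0, best_a = 0, best_b = 0;

    /* Naive search: try every pair of start positions and extend the
     * match as far as the tokens stay equal. */
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            size_t k = 0;
            while (i + k < na && j + k < nb && a[i + k] == b[j + k])
                k++;
            if (k > best) {
                best = k;
                best_a = i;
                best_b = j;
            }
        }
    }

    if (pos_a) *pos_a = best_a;
    if (pos_b) *pos_b = best_b;
    return best;
}
```

Because the comparison works on tokens rather than characters, differences
in layout and comments would not affect the result, provided the lexer
discards them. The brute-force loop is only suitable for small inputs.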

The program is fast:
the UNIX version on a Sun ULTRA does about 50000 tokens/sec,
the DOS version on a Pentium 166 does about 25000 tokens/sec.

SIM is available for UNIX (in source code) and MSDOS (32-bit executables).

UNIX:
	To obtain the files, do:
		sh sim_2_21.shar
	This unpacks the sources, the Makefile, sim.1 and READ_ME.
	For installation notes and other information, see READ_ME.

MSDOS:
	To obtain the files, do:
		[pk]unzip SIM_2_21.zip
	This unpacks the executables, SIM.DOC and READ.ME.
	For other information, see READ.ME.

Changes from Release 2.19:
	Various changes necessitated by differences in Linux flex.

Changes from Release 2.16:
	Various updates and adjustments in the code and the installation
	procedure.

Changes from Release 2.13:
	Percentage reporting feature added.

Changes from Release 2.12:
	Miranda checker added.

Changes from Release 2.9:
	Java checker added.
	The C checker 'sim' was renamed to 'sim_c', for uniformity.
	Converted the sources to ANSI C.
	All versions now report non-ASCII characters in the input.

Changes from Release 2.8:
	DOS versions can now compare very large files (>400000 tokens).

Changes from Release 1.21, as posted in comp.sources.unix (1987):
	Ported to MSDOS
	Significant speed improvements
	New options: -e, -S and / , to compare files group-wise
	New option: -F , to require function names to match exactly
	Lisp version added
	Miscellaneous improvements


					Dick Grune
					Vrije Universiteit
					de Boelelaan 1081
					1081 HV  Amsterdam
					the Netherlands
					email: dick@cs.vu.nl
					ftp://ftp.cs.vu.nl/pub/dick
					http://www.cs.vu.nl/~dick