Commit 18c851b3 authored by Stas Syrota

Adjusted scripts

parent 00de7bf7
Showing changes with 0 additions and 8925 deletions
File deleted
2,5,6,7,8
1,2,3,6,7,8
2,4,6,8
3,6,7
2,6,7
2,3,6,7,8
File deleted
File deleted
3.600000 79.000000
1.800000 54.000000
3.333000 74.000000
2.283000 62.000000
4.533000 85.000000
2.883000 55.000000
4.700000 88.000000
3.600000 85.000000
1.950000 51.000000
4.350000 85.000000
1.833000 54.000000
3.917000 84.000000
4.200000 78.000000
1.750000 47.000000
4.700000 83.000000
2.167000 52.000000
1.750000 62.000000
4.800000 84.000000
1.600000 52.000000
4.250000 79.000000
1.800000 51.000000
1.750000 47.000000
3.450000 78.000000
3.067000 69.000000
4.533000 74.000000
3.600000 83.000000
1.967000 55.000000
4.083000 76.000000
3.850000 78.000000
4.433000 79.000000
4.300000 73.000000
4.467000 77.000000
3.367000 66.000000
4.033000 80.000000
3.833000 74.000000
2.017000 52.000000
1.867000 48.000000
4.833000 80.000000
1.833000 59.000000
4.783000 90.000000
4.350000 80.000000
1.883000 58.000000
4.567000 84.000000
1.750000 58.000000
4.533000 73.000000
3.317000 83.000000
3.833000 64.000000
2.100000 53.000000
4.633000 82.000000
2.000000 59.000000
4.800000 75.000000
4.716000 90.000000
1.833000 54.000000
4.833000 80.000000
1.733000 54.000000
4.883000 83.000000
3.717000 71.000000
1.667000 64.000000
4.567000 77.000000
4.317000 81.000000
2.233000 59.000000
4.500000 84.000000
1.750000 48.000000
4.800000 82.000000
1.817000 60.000000
4.400000 92.000000
4.167000 78.000000
4.700000 78.000000
2.067000 65.000000
4.700000 73.000000
4.033000 82.000000
1.967000 56.000000
4.500000 79.000000
4.000000 71.000000
1.983000 62.000000
5.067000 76.000000
2.017000 60.000000
4.567000 78.000000
3.883000 76.000000
3.600000 83.000000
4.133000 75.000000
4.333000 82.000000
4.100000 70.000000
2.633000 65.000000
4.067000 73.000000
4.933000 88.000000
3.950000 76.000000
4.517000 80.000000
2.167000 48.000000
4.000000 86.000000
2.200000 60.000000
4.333000 90.000000
1.867000 50.000000
4.817000 78.000000
1.833000 63.000000
4.300000 72.000000
4.667000 84.000000
3.750000 75.000000
1.867000 51.000000
4.900000 82.000000
2.483000 62.000000
4.367000 88.000000
2.100000 49.000000
4.500000 83.000000
4.050000 81.000000
1.867000 47.000000
4.700000 84.000000
1.783000 52.000000
4.850000 86.000000
3.683000 81.000000
4.733000 75.000000
2.300000 59.000000
4.900000 89.000000
4.417000 79.000000
1.700000 59.000000
4.633000 81.000000
2.317000 50.000000
4.600000 85.000000
1.817000 59.000000
4.417000 87.000000
2.617000 53.000000
4.067000 69.000000
4.250000 77.000000
1.967000 56.000000
4.600000 88.000000
3.767000 81.000000
1.917000 45.000000
4.500000 82.000000
2.267000 55.000000
4.650000 90.000000
1.867000 45.000000
4.167000 83.000000
2.800000 56.000000
4.333000 89.000000
1.833000 46.000000
4.383000 82.000000
1.883000 51.000000
4.933000 86.000000
2.033000 53.000000
3.733000 79.000000
4.233000 81.000000
2.233000 60.000000
4.533000 82.000000
4.817000 77.000000
4.333000 76.000000
1.983000 59.000000
4.633000 80.000000
2.017000 49.000000
5.100000 96.000000
1.800000 53.000000
5.033000 77.000000
4.000000 77.000000
2.400000 65.000000
4.600000 81.000000
3.567000 71.000000
4.000000 70.000000
4.500000 81.000000
4.083000 93.000000
1.800000 53.000000
3.967000 89.000000
2.200000 45.000000
4.150000 86.000000
2.000000 58.000000
3.833000 78.000000
3.500000 66.000000
4.583000 76.000000
2.367000 63.000000
5.000000 88.000000
1.933000 52.000000
4.617000 93.000000
1.917000 49.000000
2.083000 57.000000
4.583000 77.000000
3.333000 68.000000
4.167000 81.000000
4.333000 81.000000
4.500000 73.000000
2.417000 50.000000
4.000000 85.000000
4.167000 74.000000
1.883000 55.000000
4.583000 77.000000
4.250000 83.000000
3.767000 83.000000
2.033000 51.000000
4.433000 78.000000
4.083000 84.000000
1.833000 46.000000
4.417000 83.000000
2.183000 55.000000
4.800000 81.000000
1.833000 57.000000
4.800000 76.000000
4.100000 84.000000
3.966000 77.000000
4.233000 81.000000
3.500000 87.000000
4.366000 77.000000
2.250000 51.000000
4.667000 78.000000
2.100000 60.000000
4.350000 82.000000
4.133000 91.000000
1.867000 53.000000
4.600000 78.000000
1.783000 46.000000
4.367000 77.000000
3.850000 84.000000
1.933000 49.000000
4.500000 83.000000
2.383000 71.000000
4.700000 80.000000
1.867000 49.000000
3.833000 75.000000
3.417000 64.000000
4.233000 76.000000
2.400000 53.000000
4.800000 94.000000
2.000000 55.000000
4.150000 76.000000
1.867000 50.000000
4.267000 82.000000
1.750000 54.000000
4.483000 75.000000
4.000000 78.000000
4.117000 79.000000
4.083000 78.000000
4.267000 78.000000
3.917000 70.000000
4.550000 79.000000
4.083000 70.000000
2.417000 54.000000
4.183000 86.000000
2.217000 50.000000
4.450000 90.000000
1.883000 54.000000
1.850000 54.000000
4.283000 77.000000
3.950000 79.000000
2.333000 64.000000
4.150000 75.000000
2.350000 47.000000
4.933000 86.000000
2.900000 63.000000
4.583000 85.000000
3.833000 82.000000
2.083000 57.000000
4.367000 82.000000
2.133000 67.000000
4.350000 74.000000
2.200000 54.000000
4.450000 83.000000
3.567000 73.000000
4.500000 73.000000
4.150000 88.000000
3.817000 80.000000
3.917000 71.000000
4.450000 83.000000
2.000000 56.000000
4.283000 79.000000
4.767000 78.000000
4.533000 84.000000
1.850000 58.000000
4.250000 83.000000
1.983000 43.000000
2.250000 60.000000
4.750000 75.000000
4.117000 81.000000
2.150000 46.000000
4.417000 90.000000
1.817000 46.000000
4.467000 74.000000
This diff is collapsed.
"Sepal Length","Sepal Width","Petal Length","Petal Width","Type"
5.1,3.5,1.4,0.2,"Iris-setosa"
4.9,3,1.4,0.2,"Iris-setosa"
4.7,3.2,1.3,0.2,"Iris-setosa"
4.6,3.1,1.5,0.2,"Iris-setosa"
5,3.6,1.4,0.2,"Iris-setosa"
5.4,3.9,1.7,0.4,"Iris-setosa"
4.6,3.4,1.4,0.3,"Iris-setosa"
5,3.4,1.5,0.2,"Iris-setosa"
4.4,2.9,1.4,0.2,"Iris-setosa"
4.9,3.1,1.5,0.1,"Iris-setosa"
5.4,3.7,1.5,0.2,"Iris-setosa"
4.8,3.4,1.6,0.2,"Iris-setosa"
4.8,3,1.4,0.1,"Iris-setosa"
4.3,3,1.1,0.1,"Iris-setosa"
5.8,4,1.2,0.2,"Iris-setosa"
5.7,4.4,1.5,0.4,"Iris-setosa"
5.4,3.9,1.3,0.4,"Iris-setosa"
5.1,3.5,1.4,0.3,"Iris-setosa"
5.7,3.8,1.7,0.3,"Iris-setosa"
5.1,3.8,1.5,0.3,"Iris-setosa"
5.4,3.4,1.7,0.2,"Iris-setosa"
5.1,3.7,1.5,0.4,"Iris-setosa"
4.6,3.6,1,0.2,"Iris-setosa"
5.1,3.3,1.7,0.5,"Iris-setosa"
4.8,3.4,1.9,0.2,"Iris-setosa"
5,3,1.6,0.2,"Iris-setosa"
5,3.4,1.6,0.4,"Iris-setosa"
5.2,3.5,1.5,0.2,"Iris-setosa"
5.2,3.4,1.4,0.2,"Iris-setosa"
4.7,3.2,1.6,0.2,"Iris-setosa"
4.8,3.1,1.6,0.2,"Iris-setosa"
5.4,3.4,1.5,0.4,"Iris-setosa"
5.2,4.1,1.5,0.1,"Iris-setosa"
5.5,4.2,1.4,0.2,"Iris-setosa"
4.9,3.1,1.5,0.1,"Iris-setosa"
5,3.2,1.2,0.2,"Iris-setosa"
5.5,3.5,1.3,0.2,"Iris-setosa"
4.9,3.1,1.5,0.1,"Iris-setosa"
4.4,3,1.3,0.2,"Iris-setosa"
5.1,3.4,1.5,0.2,"Iris-setosa"
5,3.5,1.3,0.3,"Iris-setosa"
4.5,2.3,1.3,0.3,"Iris-setosa"
4.4,3.2,1.3,0.2,"Iris-setosa"
5,3.5,1.6,0.6,"Iris-setosa"
5.1,3.8,1.9,0.4,"Iris-setosa"
4.8,3,1.4,0.3,"Iris-setosa"
5.1,3.8,1.6,0.2,"Iris-setosa"
4.6,3.2,1.4,0.2,"Iris-setosa"
5.3,3.7,1.5,0.2,"Iris-setosa"
5,3.3,1.4,0.2,"Iris-setosa"
7,3.2,4.7,1.4,"Iris-versicolor"
6.4,3.2,4.5,1.5,"Iris-versicolor"
6.9,3.1,4.9,1.5,"Iris-versicolor"
5.5,2.3,4,1.3,"Iris-versicolor"
6.5,2.8,4.6,1.5,"Iris-versicolor"
5.7,2.8,4.5,1.3,"Iris-versicolor"
6.3,3.3,4.7,1.6,"Iris-versicolor"
4.9,2.4,3.3,1,"Iris-versicolor"
6.6,2.9,4.6,1.3,"Iris-versicolor"
5.2,2.7,3.9,1.4,"Iris-versicolor"
5,2,3.5,1,"Iris-versicolor"
5.9,3,4.2,1.5,"Iris-versicolor"
6,2.2,4,1,"Iris-versicolor"
6.1,2.9,4.7,1.4,"Iris-versicolor"
5.6,2.9,3.6,1.3,"Iris-versicolor"
6.7,3.1,4.4,1.4,"Iris-versicolor"
5.6,3,4.5,1.5,"Iris-versicolor"
5.8,2.7,4.1,1,"Iris-versicolor"
6.2,2.2,4.5,1.5,"Iris-versicolor"
5.6,2.5,3.9,1.1,"Iris-versicolor"
5.9,3.2,4.8,1.8,"Iris-versicolor"
6.1,2.8,4,1.3,"Iris-versicolor"
6.3,2.5,4.9,1.5,"Iris-versicolor"
6.1,2.8,4.7,1.2,"Iris-versicolor"
6.4,2.9,4.3,1.3,"Iris-versicolor"
6.6,3,4.4,1.4,"Iris-versicolor"
6.8,2.8,4.8,1.4,"Iris-versicolor"
6.7,3,5,1.7,"Iris-versicolor"
6,2.9,4.5,1.5,"Iris-versicolor"
5.7,2.6,3.5,1,"Iris-versicolor"
5.5,2.4,3.8,1.1,"Iris-versicolor"
5.5,2.4,3.7,1,"Iris-versicolor"
5.8,2.7,3.9,1.2,"Iris-versicolor"
6,2.7,5.1,1.6,"Iris-versicolor"
5.4,3,4.5,1.5,"Iris-versicolor"
6,3.4,4.5,1.6,"Iris-versicolor"
6.7,3.1,4.7,1.5,"Iris-versicolor"
6.3,2.3,4.4,1.3,"Iris-versicolor"
5.6,3,4.1,1.3,"Iris-versicolor"
5.5,2.5,4,1.3,"Iris-versicolor"
5.5,2.6,4.4,1.2,"Iris-versicolor"
6.1,3,4.6,1.4,"Iris-versicolor"
5.8,2.6,4,1.2,"Iris-versicolor"
5,2.3,3.3,1,"Iris-versicolor"
5.6,2.7,4.2,1.3,"Iris-versicolor"
5.7,3,4.2,1.2,"Iris-versicolor"
5.7,2.9,4.2,1.3,"Iris-versicolor"
6.2,2.9,4.3,1.3,"Iris-versicolor"
5.1,2.5,3,1.1,"Iris-versicolor"
5.7,2.8,4.1,1.3,"Iris-versicolor"
6.3,3.3,6,2.5,"Iris-virginica"
5.8,2.7,5.1,1.9,"Iris-virginica"
7.1,3,5.9,2.1,"Iris-virginica"
6.3,2.9,5.6,1.8,"Iris-virginica"
6.5,3,5.8,2.2,"Iris-virginica"
7.6,3,6.6,2.1,"Iris-virginica"
4.9,2.5,4.5,1.7,"Iris-virginica"
7.3,2.9,6.3,1.8,"Iris-virginica"
6.7,2.5,5.8,1.8,"Iris-virginica"
7.2,3.6,6.1,2.5,"Iris-virginica"
6.5,3.2,5.1,2,"Iris-virginica"
6.4,2.7,5.3,1.9,"Iris-virginica"
6.8,3,5.5,2.1,"Iris-virginica"
5.7,2.5,5,2,"Iris-virginica"
5.8,2.8,5.1,2.4,"Iris-virginica"
6.4,3.2,5.3,2.3,"Iris-virginica"
6.5,3,5.5,1.8,"Iris-virginica"
7.7,3.8,6.7,2.2,"Iris-virginica"
7.7,2.6,6.9,2.3,"Iris-virginica"
6,2.2,5,1.5,"Iris-virginica"
6.9,3.2,5.7,2.3,"Iris-virginica"
5.6,2.8,4.9,2,"Iris-virginica"
7.7,2.8,6.7,2,"Iris-virginica"
6.3,2.7,4.9,1.8,"Iris-virginica"
6.7,3.3,5.7,2.1,"Iris-virginica"
7.2,3.2,6,1.8,"Iris-virginica"
6.2,2.8,4.8,1.8,"Iris-virginica"
6.1,3,4.9,1.8,"Iris-virginica"
6.4,2.8,5.6,2.1,"Iris-virginica"
7.2,3,5.8,1.6,"Iris-virginica"
7.4,2.8,6.1,1.9,"Iris-virginica"
7.9,3.8,6.4,2,"Iris-virginica"
6.4,2.8,5.6,2.2,"Iris-virginica"
6.3,2.8,5.1,1.5,"Iris-virginica"
6.1,2.6,5.6,1.4,"Iris-virginica"
7.7,3,6.1,2.3,"Iris-virginica"
6.3,3.4,5.6,2.4,"Iris-virginica"
6.4,3.1,5.5,1.8,"Iris-virginica"
6,3,4.8,1.8,"Iris-virginica"
6.9,3.1,5.4,2.1,"Iris-virginica"
6.7,3.1,5.6,2.4,"Iris-virginica"
6.9,3.1,5.1,2.3,"Iris-virginica"
5.8,2.7,5.1,1.9,"Iris-virginica"
6.8,3.2,5.9,2.3,"Iris-virginica"
6.7,3.3,5.7,2.5,"Iris-virginica"
6.7,3,5.2,2.3,"Iris-virginica"
6.3,2.5,5,1.9,"Iris-virginica"
6.5,3,5.2,2,"Iris-virginica"
6.2,3.4,5.4,2.3,"Iris-virginica"
5.9,3,5.1,1.8,"Iris-virginica"
File deleted
File deleted
This diff is collapsed.
***************************************************************************
***************************************************************************
*** messy_data ***
***************************************************************************
***************************************************************************
This dataset is an adaptation of an existing dataset, intended to highlight
some common issues (or variants of them) that one might face across various
datasets.
This is not real data, but it is based on values from the Auto-Mpg Data.
The original data was obtained from:
https://archive.ics.uci.edu/ml/datasets/auto+mpg
but was modified to introduce some formatting issues and to remove some
values.
Missing values in the original dataset were sometimes denoted
with a question mark. Some additional missing values were introduced, too.
Specifically, zeroes in the attributes mpg and displacement can be
considered missing values.
For reference, the description of the original dataset is provided below.
***************************************************************************
***************************************************************************
*** Original dataset description ***
***************************************************************************
***************************************************************************
1. Title: Auto-Mpg Data
2. Sources:
(a) Origin: This dataset was taken from the StatLib library which is
maintained at Carnegie Mellon University. The dataset was
used in the 1983 American Statistical Association Exposition.
(c) Date: July 7, 1993
3. Past Usage:
- See 2b (above)
- Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning.
In Proceedings on the Tenth International Conference of Machine
Learning, 236-243, University of Massachusetts, Amherst. Morgan
Kaufmann.
4. Relevant Information:
This dataset is a slightly modified version of the dataset provided in
the StatLib library. In line with the use by Ross Quinlan (1993) in
predicting the attribute "mpg", 8 of the original instances were removed
because they had unknown values for the "mpg" attribute. The original
dataset is available in the file "auto-mpg.data-original".
"The data concerns city-cycle fuel consumption in miles per gallon,
to be predicted in terms of 3 multivalued discrete and 5 continuous
attributes." (Quinlan, 1993)
5. Number of Instances: 398
6. Number of Attributes: 9 including the class attribute
7. Attribute Information:
1. mpg: continuous
2. cylinders: multi-valued discrete
3. displacement: continuous
4. horsepower: continuous
5. weight: continuous
6. acceleration: continuous
7. model year: multi-valued discrete
8. origin: multi-valued discrete
9. car name: string (unique for each instance)
8. Missing Attribute Values: horsepower has 6 missing values
messy_data
mpg cylinders displacement horsepower weight acceleration modelyear origin carname
mpg cyl disp hp w acc yr org name
18 8 ? 130 3'504 12.0 70 1 chevrolet chevelle malibu
15 8 350 165 3'693 11,5 70 1 buick skylark 320
18 8 ? 150 3'436 11.0 70 1 plymouth satellite
16 8 ? 150 3'433 12.0 70 1 amc rebel sst
17 8 0 140 3'449 10,5 70 1 ford torino
15 8 429 198 4'341 10.0 70 1 ford galaxie 500
14 8 454 220 4'354 9.0 70 1 chevrolet impala
14 8 ? 215 4312 8,5 70 1 plymouth fury iii
14 8 455 225 4425 10.0 70 1 pontiac catalina
15 8 390 190 3'850 8,5 70 1 amc ambassador dpl
15 8 0 170 3'563 10.0 70 1 dodge challenger se
14 8 ? 160 3'609 8.0 70 1 plymouth 'cuda 340
99 8 ? 150 3'761 9,5 70 1 chevrolet monte carlo
14 8 ? 225 3'086 10.0 70 1 buick estate wagon (sw)
24 4 113 95 2'372 15.0 70 3 toyota corona mark ii
22 6 95 2'833 15,5 70 1 plymouth duster
0 6 199 97 2'774 15,5 70 1 amc hornet
21 6 ? 85 2'587 16.0 70 1 ford maverick
27 4 97 88 2'130 14,5 70 3 datsun pl510
26 4 46 1'835 20,5 70 2 volkswagen 1131 deluxe sedan
33 4 105 74 2190 14.2 81 2 volkswagen jetta
33.7 4 107 75 2210 14.4 81 3 honda prelude
32.4 4 108 75 2350 16.8 81 3 toyota corolla
32.9 4 119 100 2615 14.8 81 3 datsun 200sx
31.6 4 120 74 2635 18.3 81 3 mazda 626
28.1 4 141 80 3'230 20.4 81 2 peugeot 505s turbo diesel
30.7 6 145 76 3'160 19.6 81 2 volvo diesel
0 6 168 116 2'900 12.6 81 3 toyota cressida
24.2 6 146 120 2'930 13.8 81 3 datsun 810 maxima
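
The issues named in the messy_data notes (question marks and zeroes standing in for missing values) and those visible in the rows above (an apostrophe as thousands separator, a comma as decimal mark, empty fields) could be handled roughly as sketched below. This is only an illustration under stated assumptions: it assumes the original file is tab-separated (the separators are not recoverable from this listing), that an empty third field is a missing displacement, and the helper names parse_number and clean_row are hypothetical. It does not try to judge suspicious values such as an mpg of 99, since the notes do not classify them.

```python
import math


def parse_number(text, zero_is_missing=False):
    """Convert one raw field to a float; return NaN for missing values."""
    text = text.strip()
    if text in ("", "?"):  # empty field or question mark means missing
        return math.nan
    # 3'504 -> 3504 (thousands separator), 11,5 -> 11.5 (decimal comma)
    value = float(text.replace("'", "").replace(",", "."))
    if zero_is_missing and value == 0:
        return math.nan
    return value


def clean_row(line, sep="\t"):
    """Split one data row (assumed tab-separated) and clean every field."""
    fields = line.rstrip("\n").split(sep)
    mpg, cyl, disp, hp, weight, acc, year, origin, name = fields
    return {
        # Per the notes, zeroes in mpg and displacement count as missing.
        "mpg": parse_number(mpg, zero_is_missing=True),
        "cylinders": int(cyl),
        "displacement": parse_number(disp, zero_is_missing=True),
        "horsepower": parse_number(hp),
        "weight": parse_number(weight),
        "acceleration": parse_number(acc),
        "model_year": int(year),
        "origin": int(origin),
        "car_name": name.strip(),
    }


# Four rows from the listing above, with tabs assumed as separators.
sample = [
    "18\t8\t?\t130\t3'504\t12.0\t70\t1\tchevrolet chevelle malibu",
    "15\t8\t350\t165\t3'693\t11,5\t70\t1\tbuick skylark 320",
    "22\t6\t\t95\t2'833\t15,5\t70\t1\tplymouth duster",
    "0\t6\t199\t97\t2'774\t15,5\t70\t1\tamc hornet",
]
rows = [clean_row(line) for line in sample]
for row in rows:
    print(row)
```

Returning NaN instead of dropping rows keeps the decision about imputation or removal with whoever analyses the data afterwards.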
File deleted
You may use the lists of names for any purpose, so long as credit is given
in any published work. You may also redistribute the list if you
provide the recipients with a copy of this README file. The lists are
not in the public domain (I retain the copyright on the lists) but are
freely redistributable.
If you have any additions to the lists of names, I would appreciate
receiving them.
My email address is mkant+@cs.cmu.edu.
Mark Kantrowitz
a
about
above
accordingly
across
after
afterwards
again
against
all
allows
almost
alone
along
already
also
although
always
am
among
amongst
an
and
another
any
anybody
anyhow
anyone
anything
anywhere
apart
appear
appropriate
are
around
as
aside
associated
at
available
away
awfully
b
back
be
became
because
become
becomes
becoming
been
before
beforehand
behind
being
below
beside
besides
best
better
between
beyond
both
brief
but
by
c
came
can
cannot
cant
cause
causes
certain
changes
co
come
consequently
contain
containing
contains
corresponding
could
currently
d
day
described
did
different
do
does
doing
done
down
downwards
during
e
each
eg
eight
either
else
elsewhere
enough
et
etc
even
ever
every
everybody
everyone
everything
everywhere
ex
example
except
f
far
few
fifth
first
five
followed
following
for
former
formerly
forth
four
from
further
furthermore
g
get
gets
given
gives
go
gone
good
got
great
h
had
hardly
has
have
having
he
hence
her
here
hereafter
hereby
herein
hereupon
hers
herself
him
himself
his
hither
how
howbeit
however
i
ie
if
ignored
immediate
in
inasmuch
inc
indeed
indicate
indicated
indicates
inner
insofar
instead
into
inward
is
it
its
itself
j
just
k
keep
kept
know
l
last
latter
latterly
least
less
lest
life
like
little
long
ltd
m
made
make
man
many
may
me
meanwhile
men
might
more
moreover
most
mostly
mr
much
must
my
myself
n
name
namely
near
necessary
neither
never
nevertheless
new
next
nine
no
nobody
none
noone
nor
normally
not
nothing
novel
now
nowhere
o
of
off
often
oh
old
on
once
one
ones
only
onto
or
other
others
otherwise
ought
our
ours
ourselves
out
outside
over
overall
own
p
particular
particularly
people
per
perhaps
placed
please
plus
possible
probably
provides
q
que
quite
r
rather
really
relatively
respectively
right
s
said
same
second
secondly
see
seem
seemed
seeming
seems
self
selves
sensible
sent
serious
seven
several
shall
she
should
since
six
so
some
somebody
somehow
someone
something
sometime
sometimes
somewhat
somewhere
specified
specify
specifying
state
still
sub
such
sup
t
take
taken
than
that
the
their
theirs
them
themselves
then
thence
there
thereafter
thereby
therefore
therein
thereupon
these
they
third
this
thorough
thoroughly
those
though
three
through
throughout
thru
thus
time
to
together
too
toward
towards
twice
two
u
under
unless
until
unto
up
upon
us
use
used
useful
uses
using
usually
v
value
various
very
via
viz
vs
w
was
way
we
well
went
were
what
whatever
when
whence
whenever
where
whereafter
whereas
whereby
wherein
whereupon
wherever
whether
which
while
whither
who
whoever
whole
whom
whose
why
will
with
within
without
work
world
would
x
y
year
years
yet
you
your
yours
yourself
yourselves
z
zero
\ No newline at end of file
File deleted
File deleted
File deleted
File deleted
File deleted