Lua script for checking the health of Devuan Linux package mirrors.

As the name would suggest, the ultimate goal of apt-panopticon is to
probe into every nook and cranny of Devuan's apt distribution system, to
find out what is breaking, and where, graphing history and sending
alerts as needed.  At some later stage it's likely to be generalised for
other apt based distros.

This is currently under development, and not everything has been written
yet.  This document mentions some things that are not written yet.

apt-panopticon is a Lua script used by the Devuan mirror admins (maybe,
if they like it) to check the health of Devuan Linux package mirrors. 
Originally there were bash scripts for this job, then Evilham wrote some
Python scripts, and now onefang has written it in Lua.  We all have
different tastes in languages.  lol

The main difference is that this Lua version tries to do everything, and
will be maintained.  I think the shell scripts and Python scripts are the
ones actually being used at the moment.  Evilham asked me to write this,
after I badgered him about his Python scripts.  It should also be much
easier to use.  The previous scripts needed some work before you could
run them; this one you just download and run.

The source code is at [https://sledjhamr.org/cgit/apt-panopticon/](https://sledjhamr.org/cgit/apt-panopticon/)

The issue tracker is at [https://sledjhamr.org/mantisbt/project_page.php?project_id=13](https://sledjhamr.org/mantisbt/project_page.php?project_id=13)


Installation.
-------------

Download the source.  You may want to put the apt-panopticon.lua script
somewhere like `/usr/local/bin` and make sure it is executable.

It should run on any recent Linux.  You'll need to have the following
installed -

* curl
* dig, part of BIND.  On Debian based systems it'll be in the dnsutils package.
* flock, on Debian based systems it'll be in the util-linux package.
* gpgv
* gzip
* ionice, on Debian based systems it'll be in the util-linux package.
* luajit
* lua-rrd
* LuaSocket, on Debian based systems it'll be in the lua-socket package.
* md5sum and sha256sum, on Debian based systems they'll be in the coreutils package.
* rrdtool
* xz, on Debian based systems it'll be in the xz-utils package.

If you want to have lots of graphs, also install
[https://sledjhamr.org/cgit/apt-panopticon_cgp/](https://sledjhamr.org/cgit/apt-panopticon_cgp/).

For the apt-panopticon_cgp package, which is used to show the detailed
graphs, you'll need a web server that supports PHP.  apt-panopticon_cgp
includes some support files for running PHP via CGI, which more web
servers support.  You'll need php-cgi for that.


Web installation.
-----------------

This is a suggestion for installation on a Devuan based web server. 

Create -

    /var/www/html/apt-panopticon

Install apt-panopticon and apt-panopticon_cgp there, so you end up with -

    /var/www/html/apt-panopticon/apt-panopticon
    /var/www/html/apt-panopticon/apt-panopticon_cgp

The script update_apt-panopticon is an example script for updating
everything, including commented out commands to update the source code. 
The file apt-panopticron is an example crontab file for updating
everything once every ten minutes.  They assume your web server user is
www-data with a group of www-data, and you have a mirror user called
mirrors.  For mirror operators, that mirrors user would be the owner of
the mirror files.  You can change these to suit yourself.

Once everything is updated,
/var/www/html/apt-panopticon/results/Report-web.html
will point to the main web page, and there will be a link at the bottom of
that pointing to the detailed graphs.

Note that two runs of apt-panopticon have to happen at least ten minutes
apart before you will see any data on the graphs.

If you had already been running apt-panopticon for a while and have lots
of data collected, the apt-panopticon-update-data.lua script can go
through all of that and feed it to RRD / update the files.

Using it.
---------

These examples assume you are running it from the source code directory. 
A directory called `results` will be created; it'll be full of log files
and any files that get downloaded.  There will also be `results/email`
and `results/web` directories, with the notification emails and web pages
(once I write that bit).

Note that unlike typical commands, you can't run single character options
together, so this is wrong -

    $ ./apt-panopticon.lua -vvv

Instead do this -

    $ ./apt-panopticon.lua -v -v -v

Just run the script to do all of the tests -

    $ ./apt-panopticon.lua

Which will print any errors.  If you don't want to see errors -

    $ ./apt-panopticon.lua -q

If you want to see warnings as well (as usual, the more `-v` options, the more
details) -

    $ ./apt-panopticon.lua -v

Or use the usual options for the help and version number (not written yet) -

    $ ./apt-panopticon.lua -h
    $ ./apt-panopticon.lua --help
    $ ./apt-panopticon.lua --version

To run the tests on a specific mirror, for example pkgmaster.devuan.org -

    $ ./apt-panopticon.lua pkgmaster.devuan.org

You can use the `--tests` option to tune which tests are run, for example
to stop IPv6 tests, coz you don't have IPv6 -

    $ ./apt-panopticon.lua --tests=-IPv6

To do the same, but not run the HTTPS tests either -

    $ ./apt-panopticon.lua --tests=-IPv6,-https

To only run the HTTP integrity tests, only on IPv6 -

    $ ./apt-panopticon.lua --tests=http,Integrity,IPv6


The tests.
----------

The basic test is to find all the IPs for a mirror, including any CNAMEs,
then send HTTP HEAD requests to those IPs, with Host headers for that
mirror, and follow any redirections, doing the same for those
redirections.  Unless a specific mirror is given on the command line, the
mirror_list.txt file from pkgmaster.devuan.org is used to select mirrors
to test.

explanations.html explains the tests in more detail.

The --tests= option can be used to adjust the list of tests performed.

* IPv4, perform the tests with IPv4 addresses (A records)
* IPv6, perform the tests with IPv6 addresses (AAAA records)
* ftp, test FTP protocol access, check for the existence of the file instead of an HTTP HEAD.
* http, test HTTP protocol access.
* https, test HTTPS protocol access.
* rsync, test RSYNC protocol access.
* DNS-RR, check if the IPs for this mirror are part of the DNS-RR.
* Protocol, warn if the protocol changed during a redirect.
* Redirects, test bad redirects, redirecting /DEVUAN/ to deb.devuan.org.
* URL-Sanity, add gratuitous multiple slashes to the URLs.
* Integrity, check hashes and PGP signatures.
* Updated, check Release dates and updated packages.
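
As a rough illustration of the hash half of the Integrity test, here is a
minimal Python sketch (not the actual Lua code) that checks a downloaded
file against a SHA-256 digest.  The function name `sha256_matches` is made
up for this example, and it assumes the expected digest has already been
obtained from the reference site.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file at `path` has the SHA-256 digest `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large package indexes need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # Hypothetical convention: digests are compared as lowercase hex strings.
    return digest.hexdigest() == expected_hex.strip().lower()
```

The PGP signature half is not covered here; that is what gpgv is listed as
a dependency for.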

Note that apt-panopticon will detect if you have no IPv6 connectivity, and
disable IPv6 tests automatically.

The old tests include a "DNS-RR" test, though I'm not sure what that is.  I
guess it checks if the mirror responds properly when it's accessed via its
DNS RR (round robin) IP, with a Host header of deb.devuan.org.  If no
other mirror is listed on the command line, we start with deb.devuan.org
and check all of its IPs, which are the DNS RR mirrors anyway.

The mirror_list.txt file is also used to select which protocols to test
for each mirror, so only the protocols a mirror lists as supported are
tested.


Options.
--------

--help

Print the help text.

--version

Print the version.

-v

Print more verbose output.  Normally only CRITICAL and ERROR messages are
printed.  -v will print WARNING messages as well, -v -v INFO messages,
and -v -v -v DEBUG messages.  All messages are logged regardless.

-q

Only print CRITICAL messages.
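
The way -v and -q select message levels can be sketched like this.  It is
a small Python illustration, not the Lua code itself; `printed_levels` is
a made up name, and what happens when -q and -v are mixed is an assumption
(here -q simply resets the count).

```python
LEVELS = ["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"]


def printed_levels(args):
    """Return the message levels that would be printed for these arguments."""
    verbosity = 1  # default: CRITICAL and ERROR
    for arg in args:
        if arg == "-v":
            verbosity += 1
        elif arg == "-q":
            verbosity = 0  # assumption: -q resets to CRITICAL only
    verbosity = min(verbosity, len(LEVELS) - 1)
    return LEVELS[: verbosity + 1]


print(printed_levels(["-v", "-v"]))
```

Since each option is matched as a whole argument, `-vvv` is not recognised
in this sketch, which mirrors the note earlier about not combining single
character options.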

-k

Keep any results from the previous runs, instead of deleting them before
running the tests.  This may or may not be obsolete.

-4 and -6

Used internally to pass around flags to make sure curl only tries to use
IPv4 or IPv6 as appropriate.

-o and -r

Used internally to keep track of -origin server and -redirect server.

--bandwidth, --low, --medium, --high, --more, and --all

Enable and disable tests that use more or less bandwidth.  --bandwidth=x
where x is a digit between 0 and 4, and the other options are aliases for
those numbers.  So --bandwidth=0 is the same as --low.  --bandwidth=2 or
--high is the default.
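
As a sketch of how the aliases line up with the numbers (in Python, for
illustration only): --low = 0 and --high = 2 are stated above, while the
values for --medium, --more, and --all are inferred from the order the
options are listed in, so treat them as an assumption.  `bandwidth_level`
is a hypothetical helper name.

```python
# --low and --high come from the text; the other three are inferred.
BANDWIDTH_ALIASES = {"--low": 0, "--medium": 1, "--high": 2, "--more": 3, "--all": 4}
DEFAULT_BANDWIDTH = 2


def bandwidth_level(args):
    """Return the bandwidth level (0 to 4) selected by the arguments."""
    level = DEFAULT_BANDWIDTH
    for arg in args:
        if arg in BANDWIDTH_ALIASES:
            level = BANDWIDTH_ALIASES[arg]
        elif arg.startswith("--bandwidth="):
            value = arg.split("=", 1)[1]
            # Only a single digit between 0 and 4 is accepted.
            if value in ("0", "1", "2", "3", "4"):
                level = int(value)
    return level
```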

--cgi

Use the php.cgi versions of the URLs for the graphs.

--maxtime, --retries, --timeout, and --timeouts

Various parameters to determine how long to wait for downloads to
complete, how many retries to attempt for downloads that fail, how long
to wait for a connection, and how many timeouts to tolerate.

--referenceSite

The mirror to use as a reference for the tests, the default is
pkgmaster.devuan.org.

--roundRobin

The name of the DNS round robin domain, the default is deb.devuan.org.

--reports

Select which reports to generate.  The arguments are comma separated.  A
negative argument deselects a report.

--tests

Select which tests to run.  The arguments are comma separated.  A
negative argument deselects a test.  Examples are given above.
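
Here is one way the comma separated list could be interpreted, written as
a Python sketch rather than the real Lua code.  The rule that any positive
name means "run only these", while a list of only negative names adjusts
the defaults, is an assumption based on the examples earlier in this
document.  `select_tests` and the `defaults` argument are hypothetical.

```python
def select_tests(option, defaults):
    """Apply a --tests=... option to a set of default test names."""
    value = option.split("=", 1)[1]
    names = [name.strip() for name in value.split(",") if name.strip()]
    positives = [name for name in names if not name.startswith("-")]
    # Assumption: positive names replace the defaults, negatives trim them.
    selected = set(positives) if positives else set(defaults)
    for name in names:
        if name.startswith("-"):
            selected.discard(name[1:])
    return selected


defaults = {"IPv4", "IPv6", "http", "https", "Integrity"}
print(sorted(select_tests("--tests=-IPv6,-https", defaults)))
```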


Theory of operation.
--------------------

Typically you would call it without any specific mirror mentioned on the
command line.  I'll start the discussion from there.

Create the results directory.

If -k is not given, delete results/*.log.

Delete results/*.check.

touch results/stamp

Open results/apt-panopticon.log for message logging.

Download mirror_list.txt from the reference site.  Build a table of
Active mirrors keyed by the FQDN, include the listed Protocols as a sub
table.  Add the round robin domain name.  Resolve all the IPs and add
them to this table.  Write this table to results/mirrors.lua so that the
forked tests can read it.
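
To make the shape of that table a bit more concrete, here is a Python
sketch of building it.  The layout assumed for mirror_list.txt (blank line
separated stanzas of `Key: value` lines, with `FQDN`, `Protocols` separated
by `|`, and `Active: yes`) is an assumption for this example, as are the
function name and the `IPs` field; check the real file before relying on
any of it.

```python
import re


def parse_mirror_list(text):
    """Build {fqdn: {"Protocols": [...], "IPs": []}} for active mirrors."""
    mirrors = {}
    # Assumed layout: stanzas of "Key: value" lines separated by blank lines.
    for stanza in re.split(r"\n\s*\n", text):
        fields = {}
        for line in stanza.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        fqdn = fields.get("FQDN")
        if fqdn and fields.get("Active", "").lower() == "yes":
            protocols = [p.strip() for p in fields.get("Protocols", "").split("|")]
            mirrors[fqdn] = {
                "Protocols": [p for p in protocols if p],
                "IPs": [],  # filled in later, once the names are resolved
            }
    return mirrors
```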

checkHost() the reference site first.

Loop through the mirrors table, and checkHost() each one, skipping the
reference site.

Wait for all forked tests to finish.

Delete results/*.check.


The checkHost() function does this -

If there is no second argument, then the host is set to the first
argument, otherwise the host is the second argument.

Gather the IPs for the host name with the following command -

    dig +keepopen +noall +nottlid +answer example.com A example.com AAAA \
        example.com CNAME example.com SRV | sort -r | uniq

So it should end up with all the IPv4, IPv6, CNAME, and SRV records for
that host.
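
Sorting the answers into record types might look something like this
Python sketch.  It assumes each answer line from dig has the form
`name class type data` (no TTL column, because of `+nottlid`), and the
function name is made up for the example.

```python
def parse_dig_answers(output):
    """Group dig answer lines by type: {"A": [...], "AAAA": [...], ...}."""
    records = {"A": [], "AAAA": [], "CNAME": [], "SRV": []}
    for line in output.splitlines():
        parts = line.split()
        # Assumed layout without TTLs: name, class, type, then the data.
        if len(parts) < 4 or parts[0].startswith(";"):
            continue
        rtype = parts[2]
        rdata = " ".join(parts[3:])
        if rtype == "CNAME":
            rdata = rdata.rstrip(".")  # drop the trailing root dot
        # Skip duplicates, much as the "| uniq" in the pipeline would.
        if rtype in records and rdata not in records[rtype]:
            records[rtype].append(rdata)
    return records
```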

For each IPv4 and IPv6 address, fork a copy of the script something like
this (including any arguments originally provided to the script) -

    ionice -c3 ./apt-panopticon.lua example.com/path x.x.x.x &

    ionice -c3 ./apt-panopticon.lua example.com/path [x:x:x:x:x:x] &
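
Building those two command lines can be sketched in Python as below.  The
helper name and the position of the original arguments within the command
are assumptions; only the `ionice -c3` prefix and the square brackets
around IPv6 addresses come from the commands above.

```python
import ipaddress


def fork_command(script, host_path, ip, extra_args=()):
    """Return the argv list for testing one IP of a mirror at idle I/O priority."""
    address = ipaddress.ip_address(ip)
    # IPv6 addresses are wrapped in square brackets, IPv4 ones are not.
    ip_arg = f"[{ip}]" if address.version == 6 else ip
    # Assumption: the original arguments go before the host and IP.
    return ["ionice", "-c3", script, *extra_args, host_path, ip_arg]


print(fork_command("./apt-panopticon.lua", "example.com/path", "2001:db8::1"))
```

A list like this could be handed to `subprocess.Popen` to run the test in
the background, which is roughly what the trailing `&` does in the shell.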

For each CNAME, it calls checkHost() on the host, but with the CNAME as a
second argument.

SRV records don't do anything yet, coz I have yet to see one from my test
environment, so can't test it.


Each forked call of the script from above does this -

Open results/example.com_x.x.x.x.log for message logging.

Load the mirrors table from results/mirrors.lua.

If performing the Integrity or Updated tests, delete the results/example.com
directory, then download the reference files using wget.  While it should
actually perform the Integrity and Updated tests now, those haven't been
written yet.  Note that currently this downloads 4GB per mirror.

Calls checkHost() with the host as first and second arguments, and
includes the IP this time.  The inclusion of the IP causes checkHost() to
call checkFiles().


checkFiles() will call checkHEAD() for each of the reference files.


checkHEAD() uses LuaSocket (or LuaSec for HTTPS) to send a HEAD request
to the IP, with a Host header set to the original host name.  Redirects
will not be followed by that request.  If the request returns a redirect,
then checkHEAD() is called recursively.  If the redirect is to some host
we are not already checking, we call checkHost() on it, with an IP of
"redir".  This causes checkHost() to bypass the test that would otherwise
call checkFiles(), instead gathering the IPs and forking as usual.
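
The core of that request, sending a HEAD to a specific IP while naming the
mirror in the Host header and not following redirects, can be sketched in
Python with the standard `http.client` module.  This stands in for the
LuaSocket code and is not a copy of it; `head_via_ip` and its parameters
are made up for the example, and it only covers plain HTTP.

```python
import http.client


def head_via_ip(ip, host, path, port=80, timeout=10):
    """Send HEAD to `ip` with a Host header of `host`; return (status, location)."""
    connection = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # http.client never follows redirects, so a 3xx comes back as-is.
        connection.request("HEAD", path, headers={"Host": host})
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()
```

A caller would look at the returned status, and for a redirect use the
returned Location to decide whether to repeat the request or start
checking a new host.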