Create a Python script to merge feature weights extracted from yasmet into a single file
completed by: Stan K.
mentors: Kevin Brubeck Unhammer, Francis Tyers
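Write a Python script that reads the three inputs below and merges the feature weights extracted from yasmet with their feature descriptions into a single file, in the format of the example output at the end.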
Input #1:
$ cat /tmp/newtest.weights
@@@CORRECTIVE-FEATURE@@@ 1
1:0 1.41017
2:0 0.902819
3:0 1.39106
4:0 1.41017
5:0 1.30471
6:0 1.44334
7:0 1.16199
8:0 1.41017
9:0 1.26939
10:0 1.41017
11:0 1.38517
1:1 1
2:1 1.35181
3:1 1
4:1 1
5:1 0.782973
6:1 1
7:1 0.983289
8:1 1
9:1 0.892704
10:1 1
11:1 1
1:2 1
2:2 1
3:2 1
4:2 1
5:2 1
6:2 1
7:2 0.59591
8:2 1
9:2 1
10:2 1
11:2 1
12:0 1.45359
13:0 1.24147
14:0 1.45359
15:0 1.45359
16:0 1.45359
12:1 1
13:1 1
14:1 1
15:1 1
16:1 1
12:2 1
13:2 1.24008
14:2 1
15:2 1
16:2 1
17:0 1.37353
18:0 1.37353
19:0 1.37353
20:0 1.24423
21:0 1.37353
22:0 1.20501
23:0 1.37353
17:1 1
18:1 1
19:1 1
20:1 0.984989
21:1 1
22:1 1.06498
23:1 1
17:2 1
18:2 1
19:2 1
20:2 1
21:2 1
22:2 1
23:2 1
24:0 1.51865
25:0 1.51865
26:0 1.51865
27:0 1.51865
28:0 1.15947
29:0 1.51865
24:1 1
25:1 1
26:1 1
27:1 1
28:1 1.18888
29:1 1
24:2 1
25:2 1
26:2 1
27:2 1
28:2 1
29:2 1
30:0 1
31:0 1
32:0 1
33:0 1
34:0 1
35:0 1
30:1 1.54338
31:1 1.52086
32:1 1.54338
33:1 1.28253
34:1 1.54338
35:1 1.54338
30:2 1
31:2 1
32:2 1
33:2 1.2653
34:2 1
35:2 1
36:0 1
37:0 1
38:0 1
39:0 1
36:1 1
37:1 1
38:1 1
39:1 1
36:2 1.93467
37:2 1.93467
38:2 1.93467
39:2 1.93467
40:0 1
41:0 1
42:0 1
43:0 1
44:0 1
45:0 1
40:1 1.73901
41:1 1.73901
42:1 1.73901
43:1 1.73901
44:1 1.73901
45:1 1.73901
40:2 1
41:2 1
42:2 1
43:2 1
44:2 1
45:2 1
46:0 1
47:0 1
48:0 1
49:0 1
50:0 1
46:1 1.49893
47:1 1.49893
48:1 1.49893
49:1 1.49893
50:1 1.49893
46:2 1
47:2 1
48:2 1
49:2 1
50:2 1
51:0 1.37244
52:0 1.37244
53:0 1.37244
54:0 1.37244
51:1 1
52:1 1
53:1 1
54:1 1
51:2 1
52:2 1
53:2 1
54:2 1
55:0 1.46754
56:0 1.46754
57:0 1.46754
58:0 1.46754
59:0 1.46754
55:1 1
56:1 1
57:1 1
58:1 1
59:1 1
55:2 1
56:2 1
57:2 1
58:2 1
59:2 1
Input #2:
$ cat /tmp/tokamak1.that.input.features.txt
1 (-2, "lem", "field")
2 (-2, "pos", "<n>")
3 (-2, "nbr", "<sg>")
4 (-1, "lem", "line")
5 (-1, "pos", "<n>")
6 (-1, "nbr", "<pl>")
7 (0, "lem", "that")
8 (1, "lem", "move")
9 (1, "pos", "<vblex>")
10 (2, "lem", "around")
11 (2, "pos", "<pr>")
12 (-2, "lem", "by")
13 (-2, "pos", "<pr>")
14 (-1, "lem", "electromagnet")
15 (1, "lem", "surround")
16 (2, "lem", "the")
17 (-2, "lem", "electric")
18 (-2, "pos", "<adj>")
19 (-1, "lem", "current")
20 (-1, "nbr", "<sg>")
21 (1, "lem", "flow")
22 (1, "nbr", "<sg>")
23 (2, "lem", "inside")
24 (-2, "lem", "in")
25 (-1, "lem", "Geneva")
26 (1, "lem", "program")
27 (1, "pos", "<n>")
28 (1, "nbr", "<pl>")
29 (2, "lem", "be")
30 (-2, "lem", "scientist")
31 (-2, "nbr", "<pl>")
32 (-1, "lem", "announce")
33 (-1, "pos", "<vblex>")
34 (1, "lem", "prpers")
35 (2, "lem", "have")
36 (-2, "lem", "from")
37 (-1, "lem", "reach")
38 (0, "nbr", "<sg>")
39 (2, "lem", ";")
40 (-2, "lem", "the")
41 (-1, "lem", "fact")
42 (1, "lem", "charge")
43 (2, "lem", "particle")
44 (2, "pos", "<n>")
45 (2, "nbr", "<pl>")
46 (-2, "lem", "researcher")
47 (-1, "lem", "discover")
48 (1, "lem", "a")
49 (2, "lem", "simple")
50 (2, "pos", "<adj>")
51 (-2, "lem", "a")
52 (-1, "lem", "force")
53 (1, "lem", "tend")
54 (2, "lem", "to")
55 (-1, "lem", "tube")
56 (1, "lem", "simply")
57 (1, "pos", "<adv>")
58 (2, "lem", "encircle")
59 (2, "pos", "<vblex>")
60 (-2, "lem", "line")
61 (-1, "lem", "be")
62 (1, "lem", "the")
63 (2, "lem", "electric")
64 (1, "lem", "hold")
65 (-2, "lem", "be")
66 (-1, "lem", "expect")
67 (2, "lem", "occurrence")
68 (2, "nbr", "<sg>")
69 (-2, "lem", "induce")
70 (-2, "pos", "<vblex>")
71 (1, "lem", "heat")
72 (-2, "lem", "of")
73 (-1, "lem", "heat")
74 (1, "lem", "occur")
75 (2, "lem", "in")
76 (-2, "lem", "prpers")
77 (-1, "lem", "appear")
78 (2, "lem", "maximum")
Input #3 (yasmet input file):
3
0 # 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0 9:0 10:0 11:0 # 1:1 2:1 3:1 4:1 5:1 6:1 7:1 8:1 9:1 10:1 11:1 # 1:2 2:2 3:2 4:2 5:2 6:2 7:2 8:2 9:2 10:2 11:2 #
0 # 12:0 13:0 14:0 5:0 6:0 7:0 15:0 9:0 16:0 # 12:1 13:1 14:1 5:1 6:1 7:1 15:1 9:1 16:1 # 12:2 13:2 14:2 5:2 6:2 7:2 15:2 9:2 16:2 #
0 # 17:0 18:0 19:0 5:0 20:0 7:0 21:0 9:0 22:0 23:0 11:0 # 17:1 18:1 19:1 5:1 20:1 7:1 21:1 9:1 22:1 23:1 11:1 # 17:2 18:2 19:2 5:2 20:2 7:2 21:2 9:2 22:2 23:2 11:2 #
0 # 24:0 13:0 25:0 20:0 7:0 26:0 27:0 28:0 29:0 # 24:1 13:1 25:1 20:1 7:1 26:1 27:1 28:1 29:1 # 24:2 13:2 25:2 20:2 7:2 26:2 27:2 28:2 29:2 #
1 # 30:0 2:0 31:0 32:0 33:0 7:0 34:0 28:0 35:0 # 30:1 2:1 31:1 32:1 33:1 7:1 34:1 28:1 35:1 # 30:2 2:2 31:2 32:2 33:2 7:2 34:2 28:2 35:2 #
2 # 36:0 13:0 37:0 33:0 7:0 38:0 39:0 # 36:1 13:1 37:1 33:1 7:1 38:1 39:1 # 36:2 13:2 37:2 33:2 7:2 38:2 39:2 #
1 # 40:0 41:0 5:0 20:0 7:0 42:0 9:0 43:0 44:0 45:0 # 40:1 41:1 5:1 20:1 7:1 42:1 9:1 43:1 44:1 45:1 # 40:2 41:2 5:2 20:2 7:2 42:2 9:2 43:2 44:2 45:2 #
1 # 46:0 2:0 31:0 47:0 33:0 7:0 48:0 22:0 49:0 50:0 # 46:1 2:1 31:1 47:1 33:1 7:1 48:1 22:1 49:1 50:1 # 46:2 2:2 31:2 47:2 33:2 7:2 48:2 22:2 49:2 50:2 #
0 # 51:0 3:0 52:0 5:0 20:0 7:0 53:0 9:0 22:0 54:0 11:0 # 51:1 3:1 52:1 5:1 20:1 7:1 53:1 9:1 22:1 54:1 11:1 # 51:2 3:2 52:2 5:2 20:2 7:2 53:2 9:2 22:2 54:2 11:2 #
0 # 55:0 5:0 6:0 7:0 56:0 57:0 58:0 59:0 # 55:1 5:1 6:1 7:1 56:1 57:1 58:1 59:1 # 55:2 5:2 6:2 7:2 56:2 57:2 58:2 59:2 #
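The layout of Input #3 can be read off the sample: the first line appears to be the number of classes, and each later line appears to start with the observed class, followed by one "#"-delimited group of feature:class tokens per class. The sketch below parses that layout under exactly those assumptions. The function name parse_events is hypothetical, and nothing here is taken from the yasmet documentation.

```python
def parse_events(lines):
    """Parse event lines laid out like Input #3 (layout assumed from the sample)."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    n_classes = int(rows[0])  # assumption: first line is the number of classes
    events = []
    for row in rows[1:]:
        parts = row.split("#")
        gold = int(parts[0])  # assumption: leading integer is the observed class
        # Each non-empty group holds the "feature:class" tokens for one class.
        groups = [p.split() for p in parts[1:] if p.strip()]
        # The feature ids are the same in every group, so read them from the first.
        feature_ids = [int(tok.split(":")[0]) for tok in groups[0]]
        events.append((gold, feature_ids))
    return n_classes, events


sample = [
    "3",
    "0 # 1:0 2:0 # 1:1 2:1 # 1:2 2:2 #",
    "2 # 36:0 13:0 # 36:1 13:1 # 36:2 13:2 #",
]
n_classes, events = parse_events(sample)
print(n_classes, events)
```

The class count read here should match the number of distinct class suffixes (:0, :1, :2) in Input #1.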
Output (example):
1 that<rel><an><mf><sp> 1.41017 (-2, "lem", "field")
2 that<rel><an><mf><sp> 0.902819 (-2, "pos", "<n>")
3 that<rel><an><mf><sp> 1.39106 (-2, "nbr", "<sg>")
4 that<rel><an><mf><sp> 1.41017 (-1, "lem", "line")
5 that<rel><an><mf><sp> 1.30471 (-1, "pos", "<n>")
6 that<rel><an><mf><sp> 1.44334 (-1, "nbr", "<pl>")
7 that<rel><an><mf><sp> 1.16199 (0, "lem", "that")
8 that<rel><an><mf><sp> 1.41017 (1, "lem", "move")
9 that<rel><an><mf><sp> 1.26939 (1, "pos", "<vblex>")
10 that<rel><an><mf><sp> 1.41017 (2, "lem", "around")
11 that<rel><an><mf><sp> 1.38517 (2, "pos", "<pr>")
1 that<cnjsub> 1 (-2, "lem", "field")
2 that<cnjsub> 1.35181 (-2, "pos", "<n>")
3 that<cnjsub> 1 (-2, "nbr", "<sg>")
4 that<cnjsub> 1 (-1, "lem", "line")
5 that<cnjsub> 0.782973 (-1, "pos", "<n>")
6 that<cnjsub> 1 (-1, "nbr", "<pl>")
7 that<cnjsub> 0.983289 (0, "lem", "that")
8 that<cnjsub> 1 (1, "lem", "move")
9 that<cnjsub> 0.892704 (1, "pos", "<vblex>")
10 that<cnjsub> 1 (2, "lem", "around")
11 that<cnjsub> 1 (2, "pos", "<pr>")
1 that<det><dem><sg> 1 (-2, "lem", "field")
2 that<det><dem><sg> 1 (-2, "pos", "<n>")
3 that<det><dem><sg> 1 (-2, "nbr", "<sg>")
4 that<det><dem><sg> 1 (-1, "lem", "line")
5 that<det><dem><sg> 1 (-1, "pos", "<n>")
6 that<det><dem><sg> 1 (-1, "nbr", "<pl>")
7 that<det><dem><sg> 0.59591 (0, "lem", "that")
8 that<det><dem><sg> 1 (1, "lem", "move")
9 that<det><dem><sg> 1 (1, "pos", "<vblex>")
10 that<det><dem><sg> 1 (2, "lem", "around")
11 that<det><dem><sg> 1 (2, "pos", "<pr>")
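One possible shape for the merge is sketched below. It is not the submitted solution. It assumes that each weights line is "feature:class weight", that lines starting with "@@@" can be skipped, and that output lines follow the order of the weights file, as in the example above. None of the three inputs shown contains the class labels (that<rel><an><mf><sp> and so on), so the sketch takes them as a hypothetical class_labels list indexed by class number. The function name merge_weights is also hypothetical.

```python
def merge_weights(weight_lines, feature_lines, class_labels):
    """Join 'feature:class weight' lines with their feature descriptions."""
    # Map feature id -> description, e.g. "1" -> '(-2, "lem", "field")'.
    features = {}
    for line in feature_lines:
        line = line.strip()
        if line:
            fid, desc = line.split(None, 1)
            features[fid] = desc

    merged = []
    for line in weight_lines:
        line = line.strip()
        # Skip blank lines and marker lines such as "@@@CORRECTIVE-FEATURE@@@ 1".
        if not line or line.startswith("@@@"):
            continue
        key, weight = line.split()
        fid, cls = key.split(":")
        # The weight is kept as text so it is written exactly as it was read.
        merged.append(f"{fid} {class_labels[int(cls)]} {weight} {features[fid]}")
    return merged


# Small demo using a few lines copied from the inputs above.
weights = ["@@@CORRECTIVE-FEATURE@@@ 1", "1:0 1.41017", "2:1 1.35181", "7:2 0.59591"]
features = ['1 (-2, "lem", "field")', '2 (-2, "pos", "<n>")', '7 (0, "lem", "that")']
labels = ["that<rel><an><mf><sp>", "that<cnjsub>", "that<det><dem><sg>"]  # assumed order
for out_line in merge_weights(weights, features, labels):
    print(out_line)
```

Features that appear in Input #2 but have no weight in Input #1 (60 to 78 in the sample) produce no output line in this sketch, because it iterates over the weights file.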