annotate test_jsscan.py @ 5:815520476fbb default tip

accept '/' as a literal
author Atul Varma <avarma@mozilla.com>
date Thu, 22 Apr 2010 20:03:31 -0700
parents 30c1f55eff96
children
rev  changeset     parents  description              annotated lines
0    daa1c6d996f3  (none)   Origination.             everything not listed under revs 2 and 4
2    f82ff2c61c06  0        added ignore kwarg       the "Filtering" doctest; the tokenize() def and its for loop
4    30c1f55eff96  2        fixed greedy regexp bug  the two "Many ...-quoted strings on the same line" doctests

All annotated changesets are by Atul Varma <avarma@mozilla.com>.

"""
C-style comments:

>>> tokenize('/* hello */')
('c_comment', '/* hello */', (1, 0))

C++-style comments:

>>> tokenize('// hello')
('cpp_comment', '// hello', (1, 0))

Variable definitions:

>>> tokenize('  var k = 1;')
('name', 'var', (1, 2))
('name', 'k', (1, 6))
('whitespace', ' ', (1, 7))
('literal', '=', (1, 8))
('whitespace', ' ', (1, 9))
('digits', '1', (1, 10))
('literal', ';', (1, 11))

Filtering:

>>> tokenize('  k', ignore='whitespace')
('name', 'k', (1, 2))

Many double-quoted strings on the same line:

>>> tokenize(r'"hello there "+" dude"')
('string', '"hello there "', (1, 0))
('literal', '+', (1, 14))
('string', '" dude"', (1, 15))

Many single-quoted strings on the same line:

>>> tokenize(r"'hello there '+' dude'")
('string', "'hello there '", (1, 0))
('literal', '+', (1, 14))
('string', "' dude'", (1, 15))

Escaped double-quoted strings:

>>> tokenize(r'"i say \\"tomato\\""')
('string', '"i say \\\\"tomato\\\\""', (1, 0))

Unterminated double-quoted strings:

>>> tokenize(r'"i say \\"tomato\\"')
Traceback (most recent call last):
...
TokenizationError: unrecognized token '"' @ line 1, char 0
"""

import doctest

from jsscan import *

def tokenize(string, ignore=None):
    for token in Tokenizer(string).tokenize(ignore=ignore):
        print token

if __name__ == '__main__':
    doctest.testmod(verbose=True)
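
The two "many strings on the same line" doctests are annotated with the changeset described as "fixed greedy regexp bug". The jsscan patterns themselves do not appear on this page, so what follows is only a sketch of how a bug of that kind can arise, not the actual fix. GREEDY_STRING and BOUNDED_STRING are hypothetical names chosen for illustration; they are not part of jsscan.

```python
import re

# Hypothetical patterns for illustration only; the real jsscan regexps
# are not shown on this page.

# Greedy body: ".*" runs to the LAST double quote on the line.
GREEDY_STRING = re.compile(r'"(.*)"')

# Bounded body: each step consumes either a backslash escape (such as \")
# or one character that is neither a quote nor a backslash, so the match
# stops at the first unescaped closing quote.
BOUNDED_STRING = re.compile(r'"(?:\\.|[^"\\])*"')

source = '"hello there "+" dude"'

greedy = GREEDY_STRING.match(source).group(0)
bounded = BOUNDED_STRING.match(source).group(0)

print(repr(greedy))   # '"hello there "+" dude"'  (both strings swallowed)
print(repr(bounded))  # '"hello there "'          (first string only)
```

Under these assumptions the bounded pattern ends its first match at offset 14, which is where the doctest above expects the '+' literal to start. It also accepts the escaped string from the "Escaped double-quoted strings" doctest and fails to match the unterminated one, which is consistent with the error reported at char 0.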