source: icGREP/icgrep-devel/QA/greptest.xml @ 5792

Last change on this file since 5792 was 5792, checked in by cameron, 10 months ago

\N{...} expressions now anchored; name expressions in ranges functional

File size: 30.5 KB
<greptest>
<datafile id="simple1">
A few lines of input
in this simple test file
provide fodder for some simple
regexp tests.
</datafile>

<datafile id="bounded_charclass">

12=a;
13=bb;
14=ccc;
15=dddd;
16=eeeee;
17=ffffff;
18=ggggggg;
19=hhhhhhhh;
20=iiiiiiiii;
21=jjjjjjjjjj;
22=kkkkkkkkkkk;
23=llllllllllll;
24=mmmmmmmmmmmmm;
25=nnnnnnnnnnnnnn;
26=ooooooooooooooo;
27=pppppppppppppppp;
28=qqqqqqqqqqqqqqqqq;
29=rrrrrrrrrrrrrrrrrr;
30=sssssssssssssssssss;
31=tttttttttttttttttttt;
32=uuuuuuuuuuuuuuuuuuuuu;
33=vvvvvvvvvvvvvvvvvvvvvv;
34=wwwwwwwwwwwwwwwwwwwwwww;
35=xxxxxxxxxxxxxxxxxxxxxxxx;
36=yyyyyyyyyyyyyyyyyyyyyyyyy;
37=zzzzzzzzzzzzzzzzzzzzzzzzzz;
38=0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
39=01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
40=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
41=0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
42=01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
43=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
44=01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
45=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789;
46</datafile>
47
48<datafile id="RangeAltSeqMatchStarKplusWhileNotOptAny">
Dogbe hat ,/R Cat dt bt bt bt bt bat MzzzzzzzzT MaT MT McT MdT MeT M0T M1T M2T M3T M4T
Dogbe hit foffasm zza " Dog Cat 1, 4= Dog ['zxcvbnm,./R Dog MT
Dogbe hot foffasm czzb " MazazazTDogogogogog Cat 1, 4= Dog [;'zxcvbnm,./R Dogtp
Dogbe foffasm dooooc MazT" Dog Cat 1, 4= Dog [Sqwertyuiopasdfghjkl;'zxcvbnm,./R Dog Cat
Dogbe foffasm ezzzzzzzzzzzzzzt "tp Dog Cat 12, ktp 4= Dog [jkl;'zxcvbnm,./R Dogtp
Dogbe foffasm zze " Dog CatMjT , = Dog [;'zxcvbzzznm,./R Dog MazazT cat
zzcztpDogbe fofasm zazazz4z Doggg Cat 6, azzzzz= Dog [;'zxcvbonm,.R Dog TUT Dog
Natatatats Nats T M0T ed bazbzczdzt et
Dfg dc fog Nt ezt
MazazazazazazazT
</datafile>


<datafile id="StartEndAlt">
The ever-growing social networks and social media provide invaluable
sources of information for modeling the behavior of users. High-quality
user models enable superior services and functions for end users. In this
talk, I will present several examples of user modeling based on social
networks and social media. I will first describe our research in modeling
users' information preferences on Microblogs using a novel user message
model. I will then discuss our work on extracting users' daily activities,
such as dining and shopping, that inherently reflect their habits, intents and preferences.
I explain our novel transfer learning solution via a collaborative boosting
framework comprising a text-to-activity classifier for socially connected users.
I will also describe our research on user modeling in multiple, overlapping
social networks in a 'composite social network' setting. I will show the benefits of
modeling the dynamics of composite networks, where the evolution processes
of different networks are jointly considered. Finally, I will explain our
research on finding social spammers in large social networks.
</datafile>

<datafile id="special_characters">
The ] character may appear as the first character inside character class
expressions such as []>)].
In this case, the ] character does not terminate the character class, but
stands for itself.
Similarly, the - character may appear as the first or last character
in a character class expression, such as [-] or []-].  Occurring as the
first or last character in a class means that it is a member of the
class, instead of being interpreted as a range metacharacter.
For both ] and -, occurrence as the first character could mean after
an opening [^ mark for negated character class.   That is [^]] is the
class that matches everything but ], while [^-] is the class that matches
anything but -.
----------
The above line does not match [^-].
----------
]]]]]]]]]]
^^^^^^^^^^
</datafile>

<datafile id="ips"> 
201.250.180.213
236.4.20.176
137.96.194.126
245.16.96.112
245.19.58.43
131.176.131.248
248.160.22.214
156.179.88.103
174.13.62.156
256.122.123.5
16.81.78.152
177.17.24.167
32.120.25.23
138.82.66.15
4.196.8.251
101.30.211.3
209.44.105.129
56.166.31.72
247.108.224.170
124.248.83.156
113.107.178.250
189.243.10.192
184.18.189.31
48.145.33.2
188.137.131.244
49.161.61.42
14.31.211.138
24.39.39.136
146.217.131.80
205.141.18.135
159.207.166.206
96.211.62.20
23.148.44.140
109.159.129.161
183.230.172.129
48.178.63.192
224.41.190.207
144.114.56.31
151.205.132.247
161.194.12.184
87.55.69.195
214.198.102.143
173.19.17.220
197.80.158.167
121.94.119.11
208.174.42.104
124.173.96.31
112.107.215.199
162.30.140.121
227.241.9.145
6.26.111.203
106.14.115.226
107.233.237.60
153.24.163.23
197.4.54.55
111.14.253.18
43.138.139.15
125.148.160.131
173.16.80.24
30.194.250.136
173.233.196.71
</datafile>

<datafile id="emails">
danielsmithinvestment01@yahoo.com
vivian.johnp24@gmail.com
drjohnsonadamscompany@mail.com
fb43@kurtz.onmicrosoft.com
delphinehakizimana11@zipmail.com.br
mrs.swp@outlook.com
engr.saidsalem@workmail@co.za
suleadams342003@gmail.com
info.soopercredit@qq.com
aliceisdale@yahoo.com
elizabethjohnson134@hotmail.com
anikaebertus@yahoo.se
bayford_A@qq.biz
hijabfarid@hotmail.com
zaringwarkipkalya@aol.fr
monahmeddd2014@gmail.com
hijab.farid@hotmail.cam
dennis.melcher01@gmail.com
publicitycbn@gmail.com
michaelkruegerloancompany@gmail.com
michaelkruegerloancompany@gmail.co&#x0313;m
ben525387@gmail.com
dgill_pwc@mynet.com
dgill_pwc1@terra.com
tuthpala12@gmail.com
johanthony1956@e-mail.museum
christopher.white01@live.co.uk
anitaloanfirm@live.com
aliadamssolicitors@gmail.com
jonathanevans000@yahoo.com
jwatson494@yahoo.com
ec21buyer@gmail.com
sussanbien2012@gmail.com
info@pavochenkofinance.tk
honbarrijzdende@gmail.com
ernestebi699@e-mail.ua
siwei4489@yahoo.com.hk
peterkoffi.info@gmail.com
zenithbankplc106@yahoo.com
fidelitybankplc505@aim.com
kymcrox03@gmail.com
esqharsmith2015@gmail.com
facebooklottdepartment936@gmail.com
lt_industries@outlook.com
cpfi.ltd@live.nope
changying33@yahoo.com
abdoul0000hamid@gmail.com
foreign_exchange@live.co.uk
hdcliveuk@live.com
fatimahhassan1@fengv.com
mikejosephloanfirm202@gmail.com
skyebanktg@rediffmail.com
mrsbellafirm001@gmail.com
financtreasury.uk@email.com
admin@senagua.gob.ec
m2424m@live.com
stevewilliam197@gmail.com
mrmathew.martins@yahoo.com
benjaminwilliam917@gmail.com
benja&#x031C;&#x030A;minwilliam917@gmail.com
abe.shelton1@lenta.ru
owengah@live.com
dlserv01@aol.com
ee.apala@gmail.com
bbcpaydpt@live.com
undpfn20114@gmail.com
janievitek@gmail.com
creditservice@careceo.com
cying011@yahoo.com
christophe_gbeffa@hotmail.fr
christophe_gbeffa@hotmail.f&#x0301;r
christophe_gbeffa@hotmail.fr&#x0301;
maracasinter@yahoo.com
iquad94@yahoo.com
emil.jacobs@mail.com
emil.jacob@mail.ru
mgremittance.info@yahoo.co.uk
raymondmorgan02@hotmail.com
mrs_sabahibrahim@ymail.com
drthomascole7@gmail.com
barrp.agbo@outlook.fr
mrsmorganhenlenloanfirm@gmail.com
barr.njdmdcggroup@yahoo.com
hknbddhb@gmail.com
michelfoucault@outlook.fr
michélfoucault@outlook.fr
goldsupply@rediffmail.com
dvdmumbai2000@gmail.com
mikefinance02@gmail.com
moonstoneking@gmail.com
peterstone586@gmail.com
denis_andre_phillipe@aol.com
roberto.greco@aol.fr
mark_grant112@hotmail.com
nokiaxprizefoundationclaims@coolsite.net
claims14_88@libero.it
hon.leo.price@gmail.com
info_unicef@consultant.com
u_deliverycompany@yahoo.com
eldhabiblamah152@gmail.com
governorsanusi.lamido@yahoo.com.ph
emyjean18@zipmail.com.br
winningemail@luckymail.com
barristervictor_odo@yahoo.com.ph
nokia.global_promo@consultant.com
headoffice_cv20448bd@libero.it
ab.issah@yahoo.com
ab_issah@yahoo.com.tw
rifaatassad552@yahoo.com.hk
barrsandilekhumalo@gmail.com
gkiir@qq.nope
ibrahimahmed3@aol.fr
efccin@e-mail.ua
dheerajrelan@gmail.com
al-fardan@al-fardan-export.com
mellissa000@hotmail.com
verakones01@hotmail.com
kivaloanfinance999@gmail.com
atm.paydept00@outlook.com
claudiokristiansen@yahoo.co.za
info.kmf@gmx.com
mambojames689@yahoo.co.uk
a.salam2014bf@terra.com
vanessappillip99@yahoo.com
vanessaphillip@live.com
alshat@emirates.net.ae
</datafile>

<datafile id="floats">
9.7
16.07
27.675
86.162
189.36792
859.073357
1377.9901658
1514.73870948
2096.400730002
2551.2050637982
4615.26633110512
8438.114838435104
32036.61593959936
36346.00047312989
144826.22607192554
+3.1eE5
+4.992
+2.425E+10
9.5808eE10
9.5808e10
+0.416968e+0
-0.3162108-0
+0.03069882+0
+0.132378721eE+-0
0.43416726670
+-0.43416726669e+0
+-0.01976811464eE0
-0.0197681146402e+-0
0.02241943884633+0
+-0.004803458640268eE-0
+0.0008164744337844E+-0
0.00266694045551024E+0
+-0.0112132498185713980
0.0003485919632198585e+-0
-0.002599516682231249E+0
0.02315181236174286E+0
+0.0116575240311669+0
+-0.06536499789006515eE+-0
+20.914506804599366eE+-21
+-20.062034167562416eE+20
35.90964837611389E-1
+-2.5508584172940916E-0
0.6532888027107796eE0
+0.02530509823216493E0
-0.016818871414735502eE+-0
0.01041535031385609E+0
-0.017042043493346013eE0
-0.015882934560610525eE0
+-0.016271711916486607E+0
-1.1521320712689072e-1
0.5796638373356339+2
-6.78321804536429e+-8
+-18.6367662944200621
+20.63224902663965eE21
+-16.78193317331960417
10.049610186973338-21
64.51055985925869eE+-65
+71.7394478831031eE+115
+114.85412411903206eE-53
+150.50431315365464e116
-388.86846448777743eE+-334
+-75.50343657758405E-76
-75.50343657758405eE-151
-216.9511816984773E176
-175.798740561957eE-178
+13.25998057047805113
+3.745360060000819eE+27
-27.329937066467846E23
13.34390770072532E+35
+34.68092648862783eE+-36
+-35.6389454910375E-160
+493.90278138088945eE+-1037
1037.4462608675137+356
-356.17279137431007E+983
</datafile>

<datafile id = "CRLF">line with CRLF &#13;&#10;two lines with LFCR &#10;&#13;final line
</datafile>

<grepcase regexp="^$" datafile="CRLF" grepcount="1"/>
<grepcase regexp="^.*$" datafile="CRLF" grepcount="4"/>
<grepcase regexp="" datafile="CRLF" grepcount="4"/>

 <datafile id = "LU_test">
The following line has LATIN CAPITAL LETTER G WITH MACRON in single quotes.
'&#x1E20;'
</datafile>

<datafile id="4KiB-onepage">abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
383abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
384abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
385abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
386abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
387abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
388abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
389abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
390abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
391abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
392abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
393abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
394abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
395abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
396abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
397abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
398abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
399abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
400abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
401abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
402abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
403abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
404abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
405abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
406abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
407abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
408abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
409abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
410abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
411abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
412abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
413abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
414abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
415abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
416abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
417abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
418abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
419abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
420abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
421abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
422abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
423abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
424abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
425abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
426abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
427abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
428abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
429abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
430abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
431abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
432abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
433abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
434abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
435abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
436abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
437abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
438abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
439abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
440abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
441abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
442abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
443abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
444abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
445abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
446abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
447abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
448abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
449abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
450abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
451abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
452abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
453abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
454abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
455abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
456abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
457abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
458abcdefghijklmnopqrstuvwxyzABCDEFGhIJKLMNOPQRstuVWXYZ
459abcdefghijklmno</datafile>
460<grepcase regexp="ab" datafile="StartEndAlt" grepcount="4"/>
<grepcase regexp="a*b" datafile="StartEndAlt" grepcount="10"/>
<grepcase regexp="ab*" datafile="StartEndAlt" grepcount="15"/>
<grepcase regexp="^user|^I|our$" datafile="StartEndAlt" grepcount="5"/>

<grepcase regexp="fe|si" datafile="simple1" grepcount="3"/>
<grepcase regexp="in" datafile="simple1" grepcount="2"/>
<grepcase regexp="[A-Z]" datafile="simple1" grepcount="1"/>
<grepcase regexp="fodder|simple" datafile="simple1" grepcount="2"/>
<grepcase regexp="(?g)fodder|simple" datafile="simple1" grepcount="2"/>
<grepcase regexp="\w{4}\s+\w{4}\s+\w{4}" datafile="simple1" grepcount="1"/>

<grepcase regexp="[cde]{3}" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="[f-h]{5}" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="[a-z]{5}" datafile="bounded_charclass" grepcount="22"/>
<grepcase regexp="[a-z]{5,15}" datafile="bounded_charclass" grepcount="22"/>
<grepcase regexp="=[a-z]{7,}" datafile="bounded_charclass" grepcount="20"/>
<grepcase regexp="=[a-z]{5,15};" datafile="bounded_charclass" grepcount="11"/>
<grepcase regexp="=(?:[a-z]{3,5}){2,};" datafile="bounded_charclass" grepcount="21"/>
<grepcase regexp="=(?:[a-z]{4,5}){2,};" datafile="bounded_charclass" grepcount="18"/>
<grepcase regexp="(([wxy]{2}){3}){2}" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="(([wxy]{2}?){3}?){2}?" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="=([a-z][c-z])*;" datafile="bounded_charclass" grepcount="12"/>
<grepcase regexp="[\u0061-\u007A]{6}" datafile="bounded_charclass" grepcount="21"/>
<grepcase regexp="[\o{142}-d]{2}" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="[\x61-\U0000007A]{6}" datafile="bounded_charclass" grepcount="21"/>
<grepcase regexp="(?i)[A-T]{6}" datafile="bounded_charclass" grepcount="15"/>
<grepcase regexp="(?i)=S[A-T]S*;" datafile="bounded_charclass" grepcount="1"/>
<grepcase regexp="=[0-9]{100};" datafile="bounded_charclass" grepcount="1"/>
<grepcase regexp="=[0-9]{50,};" datafile="bounded_charclass" grepcount="9"/>
<grepcase regexp="=[0-9]{107,};" datafile="bounded_charclass" grepcount="8"/>
<grepcase regexp="=0123[0-9]{107,};" datafile="bounded_charclass" grepcount="7"/>
<grepcase regexp="=[0-9]{299,};" datafile="bounded_charclass" grepcount="2"/>
<grepcase regexp="=0123[0-9]{295,};" datafile="bounded_charclass" grepcount="2"/>
<grepcase regexp="=[0-9]{140};" datafile="bounded_charclass" grepcount="1"/>
<grepcase regexp="=[0-9a-z]{12,200};" datafile="bounded_charclass" grepcount="22"/>
<grepcase regexp="=[0-9a-z]{200,1000};" datafile="bounded_charclass" grepcount="3"/>
<grepcase regexp="=[0-9]{500,1000};" datafile="bounded_charclass" grepcount="1"/>
<grepcase regexp="=0123[0-9]{496,996};" datafile="bounded_charclass" grepcount="1"/>
<grepcase regexp="=([a-f].{0,2})+;" datafile="bounded_charclass" grepcount="6"/>


<grepcase regexp="^D[zabcdefoy]g" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="7"/>
<grepcase regexp="do*c|ez*t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="4"/>
<grepcase regexp="M(az)*T" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="6"/>
<grepcase regexp="ez+t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="2" />
<grepcase regexp="b([a-d]z)*t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="2"/>
<grepcase regexp="[^D]og" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="2"/>
<grepcase regexp="Na?t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="2"/>
<grepcase regexp="h.t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="3" />
<grepcase regexp="do*?c|ez*?t" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="4"/>
<grepcase regexp="^.....\b" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="6"/>
<grepcase regexp="^\X\X\X\X\X\b" datafile="RangeAltSeqMatchStarKplusWhileNotOptAny" grepcount="6"/>

<grepcase regexp="[]]" datafile="special_characters" grepcount="9"/>
<grepcase regexp="[-]" datafile="special_characters" grepcount="8"/>
<grepcase regexp="[]^-]" datafile="special_characters" grepcount="14"/>
<grepcase regexp="[\-\]\^]" datafile="special_characters" grepcount="14"/>
<grepcase regexp="[^]]" datafile="special_characters" grepcount="16"/>
<grepcase regexp="[^-]" datafile="special_characters" grepcount="15"/>
<grepcase regexp="[^^]" datafile="special_characters" grepcount="16"/>
<grepcase regexp="[^]-]" datafile="special_characters" grepcount="14"/>
<grepcase regexp="[.]" datafile="special_characters" grepcount="7"/>
<grepcase regexp=")" datafile="special_characters" grepcount="1"/>

<grepcase regexp="^((([2][5][0-5]|([2][0-4]|[1][0-9]|[0-9])?[0-9])[.]){3})([2][5][0-5]|([2][0-4]|[1][0-9]|[0-9])?[0-9])$" datafile="ips" grepcount="60"/>
<grepcase regexp="^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.([a-zA-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$" datafile="emails" grepcount="116"/>
<!--<grepcase regexp="^(?g)[a-zA-Z0-9._%+-]+(?-g)@(?g)[a-zA-Z0-9.-]+(?-g)\.((?g)[a-zA-Z]{2}(?-g)|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$" datafile="emails" grepcount="120"/>-->
<grepcase regexp="^[-+]?([1-9]0?)+\.?((0*[1-9])+|0)([eE][-+]?([0-9]+)+)?$" datafile="floats" grepcount="26"/>

<!-- . should match a unique character, even if it is 3 bytes. -->
<grepcase regexp="'.'" datafile="LU_test" grepcount="1"/>
<grepcase regexp="'...'" datafile="LU_test" grepcount="0"/>
<grepcase regexp="\u{1e20}" datafile="LU_test" grepcount="1"/>
<grepcase regexp="\u1e20" datafile="LU_test" grepcount="1"/>
<grepcase regexp="\U00001e20" datafile="LU_test" grepcount="1"/>
<grepcase regexp="\o{17040}" datafile="LU_test" grepcount="1"/>
<grepcase regexp="\u{1e21}" datafile="LU_test" grepcount="0"/>
<grepcase regexp="\u1e21" datafile="LU_test" grepcount="0"/>
<grepcase regexp="\U00001e21" datafile="LU_test" grepcount="0"/>
<grepcase regexp="\o{17041}" datafile="LU_test" grepcount="0"/>
<grepcase regexp="\p{Lu}" datafile="LU_test" grepcount="2"/>
<grepcase regexp="'\p{Lu}'" datafile="LU_test" grepcount="1"/>
<grepcase regexp="\p{Ll}" datafile="LU_test" grepcount="1"/>


<datafile id="codepoints">
A line with 0x89 &#x89;
A line with 0x1234 &#x1234;
A line with 0x1245 &#x1245;
你
好
ক
কী
ককী
কীকী
ককীককীক
কীককককী
A plain line.
</datafile>
<grepcase regexp="[\u{1234}-\u{1245}]" datafile="codepoints" grepcount="2"/>
<grepcase regexp="[\u{086}-\u{9A}]" datafile="codepoints" grepcount="1"/>
<grepcase regexp="[你好]" datafile="codepoints" grepcount="2"/>
<grepcase regexp="^\u{4F60}$" datafile="codepoints" grepcount="1"/>
<grepcase regexp="(?g)^\u{4F60}$" datafile="codepoints" grepcount="1"/>
<grepcase regexp="^ক$" datafile="codepoints" grepcount="1"/>
<!-- Bad tests:
<grepcase regexp="(?g)^ক$" datafile="codepoints" grepcount="2"/>
<grepcase regexp="^ক$" datafile="codepoints" grepcount="1"/>
<grepcase regexp="(?g)^ক+$" datafile="codepoints" grepcount="6"/>
<grepcase regexp="^ক{1,27}$" datafile="codepoints" grepcount="1"/>
<grepcase regexp="(?g)^ক{1,27}$" datafile="codepoints" grepcount="6"/>
<grepcase regexp="(^ক{1,2}$)|(^ক{4,6}$)" datafile="codepoints" grepcount="1"/>
<grepcase regexp="(?g)(^ক{1,2}$)|(^ক{4,6}$)" datafile="codepoints" grepcount="6"/>
-->
<datafile id = "EmptyFile"/>
<grepcase regexp="ab" datafile="EmptyFile" grepcount="0"/>
<datafile id = "LineBreaking">CRLF1&#13;&#10;CRLF2&#13;&#10;CRLF3&#13;&#10;
LS1&#x2028;LS2&#x2028;LS3&#x2028;PS1&#x2029;PS2&#x2029;PS3&#x2029;&#x2003;
PS4&#x2029;CRLF4&#13;&#10;LS4&#x2028;LS5&#x2028;CRLF5&#13;&#10;LS6&#x2028;&#x2003;
Unterminated</datafile>

<grepcase regexp="^.*$" datafile="LineBreaking" grepcount="19"/>
<grepcase regexp="^\X*$" datafile="LineBreaking" grepcount="19"/>
<grepcase regexp="(?g)^.*$" datafile="LineBreaking" grepcount="19"/>
<grepcase regexp="Unterminated$" datafile="LineBreaking" grepcount="1"/>
<grepcase regexp="^CRLF.$" datafile="LineBreaking" grepcount="5"/>
<grepcase regexp="LS[0-9]*" datafile="LineBreaking" grepcount="6"/>
<grepcase regexp="PS" datafile="LineBreaking" grepcount="4"/>
<grepcase regexp="\S" datafile="LineBreaking" grepcount="16"/>

<grepcase regexp="[a-z]{20}" datafile="4KiB-onepage" grepcount="77"/>
<grepcase regexp="[a-z]{15}" datafile="4KiB-onepage" grepcount="78"/>


<!-- The following data file is produced from auxiliary/GraphemeBreakTest.txt by
(1) removing all comment lines and lines containing 0001, 000A, 000D or D800
(2) removing all end-of-line comment data beginning #
(3) embedding the codepoint hex values in XML hexadecimal character reference notation,
(4) and deleting the whitespace and × ÷ separators -->
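<!-- A minimal sketch of the conversion described above, in Python. It assumes each
test line carries its codepoints as 4 to 6 hexadecimal digits separated by the
× and ÷ markers, and that "lines containing 0001, 000A, 000D or D800" means lines
in which one of those codepoints occurs. The name convert_line is hypothetical
and is not part of icgrep.

```python
import re

# Codepoints whose lines are dropped in step (1).
EXCLUDED = {"0001", "000A", "000D", "D800"}

def convert_line(line):
    """Return one test line as XML character references, or None to drop it."""
    data = line.split("#", 1)[0]                      # step (2): cut the trailing comment
    codepoints = re.findall(r"[0-9A-Fa-f]{4,6}", data)
    if not codepoints:                                # step (1): comment or blank line
        return None
    if any(cp.upper() in EXCLUDED for cp in codepoints):
        return None                                   # step (1): excluded codepoint
    # steps (3) and (4): keep only the codepoints, written as character references
    return "".join("&#x" + cp + ";" for cp in codepoints)
```

For example, the line "÷ 0020 × 0308 ÷ 0020 ÷" becomes "&#x0020;&#x0308;&#x0020;". -->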
<datafile id="graphemebreaktest">&#x0020;&#x0020;
&#x0020;&#x0308;&#x0020;
&#x0020;&#x0300;
&#x0020;&#x0308;&#x0300;
&#x0020;&#x0903;
&#x0020;&#x0308;&#x0903;
&#x0020;&#x1100;
&#x0020;&#x0308;&#x1100;
&#x0020;&#x1160;
&#x0020;&#x0308;&#x1160;
&#x0020;&#x11A8;
&#x0020;&#x0308;&#x11A8;
&#x0020;&#xAC00;
&#x0020;&#x0308;&#xAC00;
&#x0020;&#xAC01;
&#x0020;&#x0308;&#xAC01;
&#x0020;&#x1F1E6;
&#x0020;&#x0308;&#x1F1E6;
&#x0020;&#x0378;
&#x0020;&#x0308;&#x0378;
&#x0300;&#x0020;
&#x0300;&#x0308;&#x0020;
&#x0300;&#x0300;
&#x0300;&#x0308;&#x0300;
&#x0300;&#x0903;
&#x0300;&#x0308;&#x0903;
&#x0300;&#x1100;
&#x0300;&#x0308;&#x1100;
&#x0300;&#x1160;
&#x0300;&#x0308;&#x1160;
&#x0300;&#x11A8;
&#x0300;&#x0308;&#x11A8;
&#x0300;&#xAC00;
&#x0300;&#x0308;&#xAC00;
&#x0300;&#xAC01;
&#x0300;&#x0308;&#xAC01;
&#x0300;&#x1F1E6;
&#x0300;&#x0308;&#x1F1E6;
&#x0300;&#x0378;
&#x0300;&#x0308;&#x0378;
&#x0903;&#x0020;
&#x0903;&#x0308;&#x0020;
&#x0903;&#x0300;
&#x0903;&#x0308;&#x0300;
&#x0903;&#x0903;
&#x0903;&#x0308;&#x0903;
&#x0903;&#x1100;
&#x0903;&#x0308;&#x1100;
&#x0903;&#x1160;
&#x0903;&#x0308;&#x1160;
&#x0903;&#x11A8;
&#x0903;&#x0308;&#x11A8;
&#x0903;&#xAC00;
&#x0903;&#x0308;&#xAC00;
&#x0903;&#xAC01;
&#x0903;&#x0308;&#xAC01;
&#x0903;&#x1F1E6;
&#x0903;&#x0308;&#x1F1E6;
&#x0903;&#x0378;
&#x0903;&#x0308;&#x0378;
&#x1100;&#x0020;
&#x1100;&#x0308;&#x0020;
&#x1100;&#x0300;
&#x1100;&#x0308;&#x0300;
&#x1100;&#x0903;
&#x1100;&#x0308;&#x0903;
&#x1100;&#x1100;
&#x1100;&#x0308;&#x1100;
&#x1100;&#x1160;
&#x1100;&#x0308;&#x1160;
&#x1100;&#x11A8;
&#x1100;&#x0308;&#x11A8;
&#x1100;&#xAC00;
&#x1100;&#x0308;&#xAC00;
&#x1100;&#xAC01;
&#x1100;&#x0308;&#xAC01;
&#x1100;&#x1F1E6;
&#x1100;&#x0308;&#x1F1E6;
&#x1100;&#x0378;
&#x1100;&#x0308;&#x0378;
&#x1160;&#x0020;
&#x1160;&#x0308;&#x0020;
&#x1160;&#x0300;
&#x1160;&#x0308;&#x0300;
&#x1160;&#x0903;
&#x1160;&#x0308;&#x0903;
&#x1160;&#x1100;
&#x1160;&#x0308;&#x1100;
&#x1160;&#x1160;
&#x1160;&#x0308;&#x1160;
&#x1160;&#x11A8;
&#x1160;&#x0308;&#x11A8;
&#x1160;&#xAC00;
&#x1160;&#x0308;&#xAC00;
&#x1160;&#xAC01;
&#x1160;&#x0308;&#xAC01;
&#x1160;&#x1F1E6;
&#x1160;&#x0308;&#x1F1E6;
&#x1160;&#x0378;
&#x1160;&#x0308;&#x0378;
&#x11A8;&#x0020;
&#x11A8;&#x0308;&#x0020;
&#x11A8;&#x0300;
&#x11A8;&#x0308;&#x0300;
&#x11A8;&#x0903;
&#x11A8;&#x0308;&#x0903;
&#x11A8;&#x1100;
&#x11A8;&#x0308;&#x1100;
&#x11A8;&#x1160;
&#x11A8;&#x0308;&#x1160;
&#x11A8;&#x11A8;
&#x11A8;&#x0308;&#x11A8;
&#x11A8;&#xAC00;
&#x11A8;&#x0308;&#xAC00;
&#x11A8;&#xAC01;
&#x11A8;&#x0308;&#xAC01;
&#x11A8;&#x1F1E6;
&#x11A8;&#x0308;&#x1F1E6;
&#x11A8;&#x0378;
&#x11A8;&#x0308;&#x0378;
&#xAC00;&#x0020;
&#xAC00;&#x0308;&#x0020;
&#xAC00;&#x0300;
&#xAC00;&#x0308;&#x0300;
&#xAC00;&#x0903;
&#xAC00;&#x0308;&#x0903;
&#xAC00;&#x1100;
&#xAC00;&#x0308;&#x1100;
&#xAC00;&#x1160;
&#xAC00;&#x0308;&#x1160;
&#xAC00;&#x11A8;
&#xAC00;&#x0308;&#x11A8;
&#xAC00;&#xAC00;
&#xAC00;&#x0308;&#xAC00;
&#xAC00;&#xAC01;
&#xAC00;&#x0308;&#xAC01;
&#xAC00;&#x1F1E6;
&#xAC00;&#x0308;&#x1F1E6;
&#xAC00;&#x0378;
&#xAC00;&#x0308;&#x0378;
&#xAC01;&#x0020;
&#xAC01;&#x0308;&#x0020;
&#xAC01;&#x0300;
&#xAC01;&#x0308;&#x0300;
&#xAC01;&#x0903;
&#xAC01;&#x0308;&#x0903;
&#xAC01;&#x1100;
&#xAC01;&#x0308;&#x1100;
&#xAC01;&#x1160;
&#xAC01;&#x0308;&#x1160;
&#xAC01;&#x11A8;
&#xAC01;&#x0308;&#x11A8;
&#xAC01;&#xAC00;
&#xAC01;&#x0308;&#xAC00;
&#xAC01;&#xAC01;
&#xAC01;&#x0308;&#xAC01;
&#xAC01;&#x1F1E6;
&#xAC01;&#x0308;&#x1F1E6;
&#xAC01;&#x0378;
&#xAC01;&#x0308;&#x0378;
&#x1F1E6;&#x0020;
&#x1F1E6;&#x0308;&#x0020;
&#x1F1E6;&#x0300;
&#x1F1E6;&#x0308;&#x0300;
&#x1F1E6;&#x0903;
&#x1F1E6;&#x0308;&#x0903;
&#x1F1E6;&#x1100;
&#x1F1E6;&#x0308;&#x1100;
&#x1F1E6;&#x1160;
&#x1F1E6;&#x0308;&#x1160;
&#x1F1E6;&#x11A8;
&#x1F1E6;&#x0308;&#x11A8;
&#x1F1E6;&#xAC00;
&#x1F1E6;&#x0308;&#xAC00;
&#x1F1E6;&#xAC01;
&#x1F1E6;&#x0308;&#xAC01;
&#x1F1E6;&#x1F1E6;
&#x1F1E6;&#x0308;&#x1F1E6;
&#x1F1E6;&#x0378;
&#x1F1E6;&#x0308;&#x0378;
&#x0378;&#x0020;
&#x0378;&#x0308;&#x0020;
&#x0378;&#x0300;
&#x0378;&#x0308;&#x0300;
&#x0378;&#x0903;
&#x0378;&#x0308;&#x0903;
&#x0378;&#x1100;
&#x0378;&#x0308;&#x1100;
&#x0378;&#x1160;
&#x0378;&#x0308;&#x1160;
&#x0378;&#x11A8;
&#x0378;&#x0308;&#x11A8;
&#x0378;&#xAC00;
&#x0378;&#x0308;&#xAC00;
&#x0378;&#xAC01;
&#x0378;&#x0308;&#xAC01;
&#x0378;&#x1F1E6;
&#x0378;&#x0308;&#x1F1E6;
&#x0378;&#x0378;
&#x0378;&#x0308;&#x0378;
&#x0061;&#x1F1E6;&#x0062;
&#x1F1F7;&#x1F1FA;
&#x1F1F7;&#x1F1FA;&#x1F1F8;
&#x1F1F7;&#x1F1FA;&#x1F1F8;&#x1F1EA;
&#x1F1F7;&#x1F1FA;&#x200B;&#x1F1F8;&#x1F1EA;
&#x1F1E6;&#x1F1E7;&#x1F1E8;
&#x1F1E6;&#x200D;&#x1F1E7;&#x1F1E8;
&#x1F1E6;&#x1F1E7;&#x200D;&#x1F1E8;
&#x0020;&#x200D;&#x0646;
&#x0646;&#x200D;&#x0020;
</datafile>

<grepcase regexp="^\X$" datafile="graphemebreaktest" grepcount="55"/>
<!--<grepcase regexp="^\X\X$" datafile="graphemebreaktest" grepcount="153"/>
<grepcase regexp="^\X{3}$" datafile="graphemebreaktest" grepcount="2"/>-->
<grepcase regexp="^\X{4,}$" datafile="graphemebreaktest" grepcount="0"/>
<!--<grepcase regexp=" \b{g}" datafile="graphemebreaktest" grepcount="28"/>
<grepcase regexp=" \B{g}" datafile="graphemebreaktest" grepcount="13"/>-->
<grepcase regexp="\x{1160}\b{g}" datafile="graphemebreaktest" grepcount="26"/>
<grepcase regexp="\x{1160}\B{g}" datafile="graphemebreaktest" grepcount="14"/>
<grepcase regexp="\b{g}\x{308}" datafile="graphemebreaktest" grepcount="0"/>
<grepcase regexp="\B{g}\x{308}" datafile="graphemebreaktest" grepcount="100"/>

<datafile id="hira_border">aa ba ca
&#x3096;&#x309d;&#x002d;&#x3088;&#x308a;a
&#x3096;a
&#x3096; &#x3096;
</datafile>
<datafile id="hiragana_and_katakana">&#x3042;&#x3044;
&#x3046;&#x3048;
&#x304a;
&#x30a2;&#x30a4;
&#x30a6;&#x30a8;&#x30aa;
</datafile>
<datafile id="upper_lower_greek">&#x0391;&#x03b1;&#x0392;&#x03b2;&#x0393;&#x03b3;
&#x0391;&#x0392;&#x0393;
&#x03b1;&#x03b2;&#x03b3;
</datafile>

<grepcase regexp="\b{script=hira}a" datafile="hira_border" grepcount="2"/>
<grepcase regexp="\b{script=hira}" datafile="hira_border" grepcount="3"/>
<grepcase regexp="\p{script=/Hir./}" datafile="hira_border" grepcount="3"/>
<grepcase regexp="\p{script=/.*Hir.*/}" datafile="hira_border" grepcount="3"/>
<grepcase regexp="\p{script=/Hir.gana/}" datafile="hiragana_and_katakana" grepcount="3"/>
<grepcase regexp="\p{script=/Kat.kana/}" datafile="hiragana_and_katakana" grepcount="2"/>
<grepcase regexp="\p{script=/(Kata|Hira).ana/}" datafile="hiragana_and_katakana" grepcount="5"/>
<grepcase regexp="\p{script=/(kata|Hira).ana/}" datafile="hiragana_and_katakana" grepcount="3"/>
<grepcase regexp="(?:\p{greek}\p{greek}\p{greek})" datafile="upper_lower_greek" grepcount="3"/>

<grepcase regexp="\p{name=/AIRPLANE/}" datafile="../All_good" grepcount="8"/>
<grepcase regexp="[\N{GREEK CAPITAL LETTER ALPHA}-\N{Greek capital letter UPSILON with DIALYTIKA}]" grepcount="27"/>
</greptest>