diff -urwpN regex-base-0.71/doc/lazy.html ghc-6.6.1/libraries/regex-base/doc/lazy.html --- regex-base-0.71/doc/lazy.html 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/doc/lazy.html 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,139 @@ + + +
++For simplest use of the new API: import Text.Regex.Lazy and one of +
+import Text.Regex.PCRE((=~),(=~~)) +import Text.Regex.Parsec((=~),(=~~)) +import Text.Regex.DFA((=~),(=~~)) +import Text.Regex.PosixRE((=~),(=~~)) +import Text.Regex.TRE((=~),(=~~)) ++The things you can demand of (=~) and (=~~) are all +instances defined in Text.Regex.Impl.Context and they are used +in Example.hs as well. +
+
+You can redefine (=~) and (=~~) to use different options by using makeRegexOpts: +
+(=~) :: (RegexMaker Regex CompOption ExecOption source,RegexContext Regex source1 target) => source1 -> source -> target +(=~) x r = let q :: Regex + q = makeRegexOpts (some compoption) (some execoption) r + in match q x + +(=~~) ::(RegexMaker Regex CompOption ExecOption source,RegexContext Regex source1 target,Monad m) => source1 -> source -> m target +(=~~) x r = let q :: Regex + q = makeRegexOpts (some compoption) (some execoption) r + in matchM q x ++There is a medium level API with functions compile/execute/regexec in +all the Text.Regex.*.(String|ByteString) modules. These allow for +errors to be reported as Either types when compiling or running. +
+The low level APIs are in the Text.Regex.*.Wrap modules. For the +C-library backends these expose most of the C API in wrap* functions +that make the types more Haskell-like: CString and CStringLen and +newtypes to specify compile and execute options. The actual foreign +calls are not exported; the modules do not expose the raw C API. +
+Also, Text.Regex.PCRE.Wrap will let you query if it was compiled with +UTF8 support: configUTF8 :: Bool. But I do not provide a way +to marshall to or from UTF8. (If you have a UTF8 ByteString then you +would probably be able to make it work, assuming the indices PCRE uses +are in bytes; otherwise look at the wrap* functions which are a thin +layer over the pcreapi). +
+ +
+The old Text.Regex API can be replaced. If you need to be drop-in +compatible with Text.Regex then you can +import Text.Regex.New and report any infidelities as bugs. + +Some advantages of Text.Regex.Parsec over Text.Regex: +
+Internally it uses Parsec to turn the string regex into +a Pattern data type, simplify the Pattern, then +transform the Pattern into a Parsec parser that +accepts matching strings and stores the sub-strings of parenthesized +groups. +
+All of this was motivated by the inability to use Text.Regex +to complete +the regex-dna +benchmark on The +Computer Language Shootout. The current entry there, by Don +Stewart and Alson Kemp and Chris Kuklewicz, does not use this Parsec +solution, but rather a custom DFA lexer from the CTK library. + + diff -urwpN regex-base-0.71/doc/README ghc-6.6.1/libraries/regex-base/doc/README --- regex-base-0.71/doc/README 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/doc/README 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,39 @@ +README for TestRegexLazy-0.66 + +By Chris Kuklewicz (TextRegexLazy (at) personal (dot) mightyreason (dot) com) + +For more detail on Text.Regex.Lazy look at the very very outdated +lazy.html file or the LICENSE file. + +To build and install: + get Data.ByteString from http://www.cse.unsw.edu.au/~dons/fps.html + (You probably want to configure ByteString's cabal with -p for profiling) + edit list of BACKENDS in Makefile if you want to exclude regex-tre or regex-pcre + edit regex-pcre/regex-pcre.cabal to point to your PCRE installation + edit CONF and USER variables in Makefile to point to your setup + (The CONF includes -p for profiling) + run "make all" which will create and install all the packages in $(SUBDIRS) + +The packages: + regex-base : Holds the type class definitions and (most) RegexContext,Extract instances + regex-compat : Builds Text.Regex.New (soon to replace Text.Regex) on top of regex-parsec + regex-pcre : Builds the PCRE backend, http://www.pcre.org/ + regex-posix : Builds the Posix backend + regex-parsec : Builds my lazy parsec based pure haskell backend + regex-dfa : Builds the simple backend based on CTKLight (this is LGPL) + +There is an additional "regex-devel" package where I am setting up +testing and benchmarking. Use "make regex-devel" at the top level to +compile (not install), or use its cabal Setup.hs. +regex-devel/bench/runbench.sh is my simple toy benchmark. 
+ +To use the new =~ and =~~ API: + +> import Text.Regex.(Parsec|DFA|PCRE|PosixRE|TRE) +and perhaps +> import Text.Regex.Base + +Look at Example*.hs and instances in Text.Regex.Base.Context.hs for what it can do. + +For old "Text.Regex" API drop-in compatibility, import Text.Regex.New (uses PosixRE backend) + diff -urwpN regex-base-0.71/doc/Redesign.txt ghc-6.6.1/libraries/regex-base/doc/Redesign.txt --- regex-base-0.71/doc/Redesign.txt 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/doc/Redesign.txt 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,14 @@ +The regular expression stuff needs something of a rethink. + +Things that could be made more efficient, as I think of them: + +(1) Making Arrays in Wrap* may be a bit inefficient +counter: Usage may be like "look up element 3" so random access is good + +(2) String DFA: the findRegex computes the prefix string itself, which is sometimes wasted / sometimes wanted / always discarded. Also, the input string at the start of the match is discarded + +(3) Lazy computes MatchedStrings array then discards it. Wasteful. + +(4) Might extend RegexLike with ability to return "strings", i.e. Extract instance. The default conversion could be left in for some things. Then RegexContext could pull from that instead of matchOnce/matchAll. 
+ +(5) make RegexLike default matchAll/matchOnce in terms of matchOnceText and matchAllText diff -urwpN regex-base-0.71/examples/Example2.hs ghc-6.6.1/libraries/regex-base/examples/Example2.hs --- regex-base-0.71/examples/Example2.hs 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/examples/Example2.hs 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,44 @@ +{-# OPTIONS_GHC -fglasgow-exts #-} +import Text.Regex.Base +import Text.Regex.Posix(Regex,(=~),(=~~)) -- or DFA or PCRE or PosixRE +import qualified Data.ByteString.Char8 as B(ByteString,pack) + +-- Show mixing of ByteString and String as well as polymorphism: + +main = let x :: (RegexContext Regex String target) => target + x = ("abaca" =~ B.pack "(.)a") + x' :: (RegexContext Regex String target,Monad m) => m target + x' = ("abaca" =~~ "(.)a") + y :: (RegexContext Regex B.ByteString target) => target + y = (B.pack "abaca" =~ "(.)a") + y' :: (RegexContext Regex B.ByteString target,Monad m) => m target + y' = (B.pack "abaca" =~~ B.pack "(.)a") + in do print (x :: Bool) + print (x :: Int) + print (x :: [MatchArray]) + print (x' :: Maybe (String,String,String,[String])) + print (y :: Bool) + print (y :: Int) + print (y :: [MatchArray]) + print (y' :: Maybe (B.ByteString,B.ByteString,B.ByteString,[B.ByteString])) + +{- Output is, except for replacing Full with DFA (which has no capture) +True +2 +[array (0,1) [(0,(1,2)),(1,(1,1))],array (0,1) [(0,(3,2)),(1,(3,1))]] +Just ("a","ba","ca",["b"]) +True +2 +[array (0,1) [(0,(1,2)),(1,(1,1))],array (0,1) [(0,(3,2)),(1,(3,1))]] +Just ("a","ba","ca",["b"]) +-} +{- The output for DFA is +True +2 +[array (0,0) [(0,(1,2))],array (0,0) [(0,(3,2))]] +Just ("a","ba","ca",[]) +True +2 +[array (0,0) [(0,(1,2))],array (0,0) [(0,(3,2))]] +Just ("a","ba","ca",[]) +-} diff -urwpN regex-base-0.71/examples/Example3.lhs ghc-6.6.1/libraries/regex-base/examples/Example3.lhs --- regex-base-0.71/examples/Example3.lhs 1970-01-01 01:00:00.000000000 +0100 +++ 
ghc-6.6.1/libraries/regex-base/examples/Example3.lhs 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,21 @@ +> {-# OPTIONS_GHC -fglasgow-exts #-} + +> import Text.Regex.Base + +> import qualified Text.Regex.PCRE as R +> import qualified Text.Regex.PosixRE as S +> import qualified Text.Regex.Parsec as F + +Choose which library to use depending on presence of PCRE library. + +> (=~) :: (RegexMaker R.Regex R.CompOption R.ExecOption a,RegexContext R.Regex b t +> ,RegexMaker F.Regex F.CompOption F.ExecOption a,RegexContext F.Regex b t +> ,RegexMaker S.Regex S.CompOption S.ExecOption a,RegexContext S.Regex b t) +> => b -> a -> t +> (=~) = case R.getVersion of +> Just _ -> (R.=~) +> Nothing -> case S.getVersion of +> Just _ -> (S.=~) +> Nothing -> (F.=~) + +> main = print ("abc" =~ "(.)c" :: Bool) \ No newline at end of file diff -urwpN regex-base-0.71/examples/Example.hs ghc-6.6.1/libraries/regex-base/examples/Example.hs --- regex-base-0.71/examples/Example.hs 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/examples/Example.hs 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,14 @@ +{-# OPTIONS_GHC -fglasgow-exts #-} +import Text.Regex.Base +import Text.Regex.Posix((=~),(=~~)) -- or DFA or PCRE or PosixRE +import qualified Data.ByteString.Char8 as B(ByteString,pack) + +main = let b :: Bool + b = ("abaca" =~ "(.)a") + c :: [MatchArray] + c = ("abaca" =~ "(.)a") + d :: Maybe (String,String,String,[String]) + d = ("abaca" =~~ "(.)a") + in do print b + print c + print d diff -urwpN regex-base-0.71/Makefile ghc-6.6.1/libraries/regex-base/Makefile --- regex-base-0.71/Makefile 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/Makefile 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,20 @@ +TOP=.. 
+include $(TOP)/mk/boilerplate.mk + +SUBDIRS = + +ALL_DIRS = \ + Text/Regex \ + Text/Regex/Base + +PACKAGE = regex-base +VERSION = 0.72 +PACKAGE_DEPS = base + +EXCLUDED_SRCS = Setup.hs + +SRC_HC_OPTS += -cpp + +SRC_HADDOCK_OPTS += -t "Haskell Hierarchical Libraries ($(PACKAGE) package)" + +include $(TOP)/mk/target.mk diff -urwpN regex-base-0.71/package.conf.in ghc-6.6.1/libraries/regex-base/package.conf.in --- regex-base-0.71/package.conf.in 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/package.conf.in 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1,27 @@ +name: PACKAGE +version: VERSION +license: BSD3 +maintainer: TextRegexLazy@personal.mightyreason.com +exposed: True + +exposed-modules: Text.Regex.Base + Text.Regex.Base.RegexLike + Text.Regex.Base.Context + Text.Regex.Base.Impl + +hidden-modules: + +import-dirs: IMPORT_DIR +library-dirs: LIB_DIR +hs-libraries: "HSregex-base" +extra-libraries: +include-dirs: +includes: +depends: base +hugs-options: +cc-options: +ld-options: +framework-dirs: +frameworks: +haddock-interfaces: HADDOCK_IFACE +haddock-html: HTML_DIR diff -urwpN regex-base-0.71/prologue.txt ghc-6.6.1/libraries/regex-base/prologue.txt --- regex-base-0.71/prologue.txt 1970-01-01 01:00:00.000000000 +0100 +++ ghc-6.6.1/libraries/regex-base/prologue.txt 2007-04-25 18:24:10.000000000 +0100 @@ -0,0 +1 @@ +Interfaces for regular expressions diff -urwpN regex-base-0.71/regex-base.cabal ghc-6.6.1/libraries/regex-base/regex-base.cabal --- regex-base-0.71/regex-base.cabal 2006-12-05 18:29:02.000000000 +0000 +++ ghc-6.6.1/libraries/regex-base/regex-base.cabal 2007-04-25 18:24:10.000000000 +0100 @@ -2,7 +2,7 @@ -- To fix for cabal < 1.1.4 comment out the Extra-Source-Files line -- **************************************************************** Name: regex-base -Version: 0.71 +Version: 0.72 -- Cabal-Version: >=1.1.4 License: BSD3 License-File: LICENSE @@ -28,7 +28,7 @@ Buildable: True -- Other-Modules: -- ********* Be backward compatible until 
6.4.2 is futher deployed -- HS-Source-Dirs: "." -Extensions: MultiParamTypeClasses, FunctionalDependencies +Extensions: MultiParamTypeClasses, FunctionalDependencies, CPP -- GHC-Options: -Wall -Werror GHC-Options: -Wall -Werror -O2 -- GHC-Options: -Wall -ddump-minimal-imports diff -urwpN regex-base-0.71/Text/Regex/Base/Context.hs ghc-6.6.1/libraries/regex-base/Text/Regex/Base/Context.hs --- regex-base-0.71/Text/Regex/Base/Context.hs 2006-12-05 18:29:02.000000000 +0000 +++ ghc-6.6.1/libraries/regex-base/Text/Regex/Base/Context.hs 2007-04-25 18:24:10.000000000 +0100 @@ -185,9 +185,12 @@ instance (RegexLike a b) => RegexContext match r s = maybe (-1,0) (!0) (matchOnce r s) matchM r s = maybe regexFailed (return.(!0)) (matchOnce r s) +#if __GLASGOW_HASKELL__ +-- overlaps with instance (RegexLike a b) => RegexContext a b (Array Int b) instance (RegexLike a b) => RegexContext a b MatchArray where match r s = maybe nullArray id (matchOnce r s) matchM r s = maybe regexFailed return (matchOnce r s) +#endif instance (RegexLike a b) => RegexContext a b (b,MatchText b,b) where match r s = maybe (s,nullArray,empty) id (matchOnceText r s) @@ -216,21 +219,27 @@ instance (RegexLike a b) => RegexContext , mrMatch = whole , mrAfter = post , mrSubs = fmap fst ma - , mrSubList = tail (map fst subs) }) + , mrSubList = map fst subs }) -- ** Instances based on matchAll,matchAllText +#if __GLASGOW_HASKELL__ +-- overlaps with instance (RegexLike a b) => RegexContext a b [Array Int b] instance (RegexLike a b) => RegexContext a b [MatchArray] where match = matchAll matchM = nullFail +#endif instance (RegexLike a b) => RegexContext a b [MatchText b] where match = matchAllText matchM = nullFail +#if __GLASGOW_HASKELL__ +-- overlaps with instance (RegexLike a b) => RegexContext a b [b] instance (RegexLike a b) => RegexContext a b [(MatchOffset,MatchLength)] where match r s = [ ma!0 | ma <- matchAll r s ] matchM = nullFail +#endif instance (RegexLike a b) => RegexContext a b [b] where match r s 
= [ fst (ma!0) | ma <- matchAllText r s ]