A library for creating scalacheck generators from regular expressions
Cross-built for Scala 2.12/2.13/3.1/3.2

In your `build.sbt`:

```scala
libraryDependencies += "io.github.wolfendale" %% "scalacheck-gen-regexp" % "[VERSION]"
```
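
```scala
import org.scalacheck.Gen
import wolfendale.scalacheck.regexp.RegexpGen

// Build a generator of strings shaped like "1,000" or "12,345,678"
val generator: Gen[String] = RegexpGen.from("[1-9]\\d?(,\\d{3})+")
```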
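
As a rough sketch of how such a generator might be used in a property, the example below samples a value and checks that generated strings match the pattern they were built from. The object name `ThousandsExample` and the pattern are illustrative only; the pattern uses explicit character classes and a bounded repetition so that the output stays small, and it assumes the generator behaves as described in this README.

```scala
import org.scalacheck.{Gen, Prop}
import wolfendale.scalacheck.regexp.RegexpGen

// Hypothetical example object, not part of the library
object ThousandsExample {

  // Explicit classes and a bounded {1,3} repetition keep generated strings short
  val pattern: String = "[1-9][0-9]?(,[0-9]{3}){1,3}"

  val generator: Gen[String] = RegexpGen.from(pattern)

  // Assumption: every generated string should match the source pattern
  val matchesPattern: Prop = Prop.forAll(generator)(s => s.matches(pattern))

  def main(args: Array[String]): Unit = {
    // sample returns an Option because a generator can fail to produce a value
    generator.sample.foreach(println)
    matchesPattern.check()
  }
}
```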

Supported features:

Feature | Example | Notes |
---|---|---|
Literals | `a`, `\\w`, `7` | Literals are transformed into constant generators |
Character Classes | `[abc]`, `[^abc]`, `[a-zA-Z0-9]` | Character classes are transformed with `Gen.oneOf` |
Default Classes | `\w`, `\d`, `\S`, `.` | These are treated as predefined character classes |
Quantifiers | `a?`, `b+`, `c*`, `d{3}`, `e{4,5}`, `f{5,}` | These use `Gen.listOfN` to create sized lists of the preceding term |
Groups | `(abc)`, `(?:def)` | Backreferences are not supported; groups can only be used for grouping terms |
Alternates | `a\|b\|c`, `a(b\|c)d` | Alternates are also transformed with `Gen.oneOf` |
Boundaries | `^`, `$`, `\b` | Although these are parsed, they do not modify the generator output |

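
To make the mapping in the table concrete, here is a hand-written generator that is roughly equivalent to the pattern `a[bc]{2,3}(x|y)`, built only from standard scalacheck combinators. This is an illustration of the idea, not the library's actual implementation; the object and value names are made up for the example.

```scala
import org.scalacheck.Gen

// Hypothetical hand-rolled equivalent of the regex a[bc]{2,3}(x|y)
object HandWrittenEquivalent {

  // Literal `a`: a constant generator
  val literal: Gen[String] = Gen.const("a")

  // Character class `[bc]`: pick one of the listed characters
  val charClass: Gen[Char] = Gen.oneOf('b', 'c')

  // Quantifier `{2,3}`: choose a length, then build a list of that size
  val quantified: Gen[String] = for {
    n  <- Gen.choose(2, 3)
    cs <- Gen.listOfN(n, charClass)
  } yield cs.mkString

  // Alternates `(x|y)`: pick one of the branches
  val alternate: Gen[String] = Gen.oneOf("x", "y")

  // Concatenate the terms in order
  val combined: Gen[String] = for {
    l <- literal
    q <- quantified
    a <- alternate
  } yield l + q + a
}
```

Sampling `HandWrittenEquivalent.combined` should give strings such as `abcx` or `acbby`, which is the kind of output you would expect from `RegexpGen.from("a[bc]{2,3}(x|y)")`.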

Not yet supported:

Feature | Example | Notes |
---|---|---|
Backreferences | `([ab]\1)` | With the current implementation there's no simple way to do this; definitely under consideration for a future release |
Octal / Hex / Special Literals | `\012`, `\xF1`, `\p{Lower}` | Most of these should be simple to implement, but I wanted to get an initial release created first |
Character Class Intersection | `[a&&[b]]`, `[a[b]]` | Difficult to implement currently but not impossible; a definite consideration for a future release |

- In order to represent any character, `RegexpGen#from` takes an implicit `Arbitrary[Char]`. There is a default instance provided by scalacheck; however, for most uses you probably want to provide your own.
- If you use the `+` or `*` quantifiers you'll end up with huge variance in string sizes. If this isn't what you want, consider bounding the lengths of certain string segments with the `{min,max}` quantifier.
- Negated character classes and negated default classes are implemented by generating an arbitrary `Char` within certain bounds via `suchThat`. Because of this you can end up throwing away a lot of cases, and in certain circumstances your tests may fail. Try to refactor out negative cases.
- In character classes each option is given equal weighting. If you'd prefer to weight a particular entry you can add it multiple times; this is made easier with string interpolation: `s"[${"a-z"*5}\\s]"`. In the example case the generator is 5 times more likely to generate a lowercase letter than a whitespace character.
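
The notes above can be sketched together in one place. The names below (`NotesExample`, `printableAscii` and so on) are hypothetical, and the example assumes that `.` draws from the implicit `Arbitrary[Char]` in scope, as the first note suggests.

```scala
import org.scalacheck.{Arbitrary, Gen}
import wolfendale.scalacheck.regexp.RegexpGen

// Hypothetical example object, not part of the library
object NotesExample {

  // Custom "any character": restrict to printable ASCII instead of the full Char range
  implicit val printableAscii: Arbitrary[Char] = Arbitrary(Gen.choose(' ', '~'))

  // {1,20} keeps lengths bounded, unlike + or *
  val boundedText: Gen[String] = RegexpGen.from(".{1,20}")

  // A positive class instead of a negated one such as [^0-9],
  // which would rely on suchThat and may discard many cases
  val lettersAndSpaces: Gen[String] = RegexpGen.from("[a-zA-Z ]{1,10}")

  // "a-z" repeated five times, so the range appears five times for each \s entry;
  // note the doubled backslash, which the s interpolator turns into a single one
  val weightedClass: String = s"[${"a-z" * 5}\\s]"

  val weighted: Gen[String] = RegexpGen.from(weightedClass + "{1,10}")
}
```

Defining the implicit in the enclosing scope means it is picked up ahead of the default instance from scalacheck, so there is no need to pass it explicitly.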