Skip to content

Commit 5e57460

Browse files
meooow25Bodigrim
authored andcommitted
Allow list fusion for Text and Text.Lazy unpack
* Make Data.Text.unpack and Data.Text.Lazy.unpack good producers in list fusion. This allows them to fuse with good consumers of lists. Rewrite-back rules are included since the function bodies are large and we don't want to inline them if fusion doesn't occur. * For Data.Text.Lazy, this change means that `unpack`, which uses `unstreamList`, no longer fuses with `streamList` under Text's stream fusion framework. This scenario seems very unlikely, since nothing else must be done to the list in between the two functions. Even `pack . unpack` does not satisfy this rule. So we are not losing anything valuable here. * Add benchmarks for unpack, fusion and no fusion.
1 parent a721bf5 commit 5e57460

File tree

3 files changed

+50
-5
lines changed

3 files changed

+50
-5
lines changed

benchmarks/haskell/Benchmarks/Pure.hs

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -208,6 +208,14 @@ benchmark kind ~Env{..} =
208208
[ benchT $ nf (T.zipWith min tb) ta
209209
, benchTL $ nf (TL.zipWith min tlb) tla
210210
]
211+
, bgroup "length . unpack" -- length should fuse with unpack
212+
[ benchT $ nf (L.length . T.unpack) ta
213+
, benchTL $ nf (L.length . TL.unpack) tla
214+
]
215+
, bgroup "length . drop 1 . unpack" -- no list fusion because of drop 1
216+
[ benchT $ nf (L.length . L.drop 1 . T.unpack) ta
217+
, benchTL $ nf (L.length . L.drop 1 . TL.unpack) tla
218+
]
211219
, bgroup "length"
212220
[ bgroup "cons"
213221
[ benchT $ nf (T.length . T.cons c) ta

src/Data/Text/Lazy.hs

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -424,7 +424,23 @@ unpack ::
424424
#endif
425425
Text -> String
426426
unpack t = S.unstreamList (stream t)
427-
{-# INLINE [1] unpack #-}
427+
{-# NOINLINE unpack #-}
428+
429+
foldrFB :: (Char -> b -> b) -> b -> Text -> b
430+
foldrFB = foldr
431+
{-# INLINE [0] foldrFB #-}
432+
433+
-- List fusion rules for `unpack`:
434+
-- * `unpack` rewrites to `build` up till (but not including) phase 1. `build`
435+
-- fuses if `foldr` is applied to it.
436+
-- * If it doesn't fuse: In phase 1, `build` inlines to give us `foldrFB (:) []`
437+
-- and we rewrite that back to `unpack`.
438+
-- * If it fuses: In phase 0, `foldrFB` inlines and `foldr` inlines. GHC
439+
-- optimizes the fused code.
440+
{-# RULES
441+
"Text.Lazy.unpack" [~1] forall t. unpack t = Exts.build (\lcons lnil -> foldrFB lcons lnil t)
442+
"Text.Lazy.unpackBack" [1] foldrFB (:) [] = unpack
443+
#-}
428444

429445
-- | /O(n)/ Convert a literal string into a Text.
430446
unpackCString# :: Addr# -> Text

src/Data/Text/Show.hs

Lines changed: 25 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ import Data.Text.Internal.Encoding.Utf8 (utf8Length)
3131
import Data.Text.Internal.Unsafe.Char (unsafeWrite)
3232
import Data.Text.Unsafe (Iter(..), iterArray)
3333
import GHC.Exts (Ptr(..), Int(..), Addr#, indexWord8OffAddr#)
34+
import qualified GHC.Exts as Exts
3435
import GHC.Word (Word8(..))
3536
import qualified Data.Text.Array as A
3637
#if !MIN_VERSION_ghc_prim(0,7,0)
@@ -53,12 +54,32 @@ unpack ::
5354
HasCallStack =>
5455
#endif
5556
Text -> String
56-
unpack (Text arr off len) = go off
57+
unpack t = foldrText (:) [] t
58+
{-# NOINLINE unpack #-}
59+
60+
foldrText :: (Char -> b -> b) -> b -> Text -> b
61+
foldrText f z (Text arr off len) = go off
5762
where
5863
go !i
59-
| i >= off + len = []
60-
| otherwise = let !(Iter c l) = iterArray arr i in c : go (i + l)
61-
{-# INLINE [1] unpack #-}
64+
| i >= off + len = z
65+
| otherwise = let !(Iter c l) = iterArray arr i in f c (go (i + l))
66+
{-# INLINE foldrText #-}
67+
68+
foldrTextFB :: (Char -> b -> b) -> b -> Text -> b
69+
foldrTextFB = foldrText
70+
{-# INLINE [0] foldrTextFB #-}
71+
72+
-- List fusion rules for `unpack`:
73+
-- * `unpack` rewrites to `build` up till (but not including) phase 1. `build`
74+
-- fuses if `foldr` is applied to it.
75+
-- * If it doesn't fuse: In phase 1, `build` inlines to give us
76+
-- `foldrTextFB (:) []` and we rewrite that back to `unpack`.
77+
-- * If it fuses: In phase 0, `foldrTextFB` inlines and `foldrText` inlines. GHC
78+
-- optimizes the fused code.
79+
{-# RULES
80+
"Text.unpack" [~1] forall t. unpack t = Exts.build (\lcons lnil -> foldrTextFB lcons lnil t)
81+
"Text.unpackBack" [1] foldrTextFB (:) [] = unpack
82+
#-}
6283

6384
-- | /O(n)/ Convert a null-terminated
6485
-- <https://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8 modified UTF-8>

0 commit comments

Comments
 (0)