weixin_39622521
weixin_39622521
2021-01-04 05:12

Unlifted Datatypes (under review)

The proposal has been accepted; the following discussion is mostly of historic interest.

Rendered

该提问来源于开源项目:ghc-proposals/ghc-proposals

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

57条回答

  • weixin_39622521 weixin_39622521 4月前

    can unlifted data be allocated directly into compact region?

    No. It has to be copied in the same manner as lifted data.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I think I resolved all requests for revision in my last commit, at least I hope so. Can we hand over to the committee again, please?

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Could consider handing this back to the committee? I'm not currently aware of any open TODOs on my side.

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    could you check that your comments are addressed now?

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    could you check that your comments are addressed now?

    I have submitted a raft of drafting suggestions, all designed to make the proposal easier to understand.

    I'm content with the technical content. So yes, once these edits are made to Simon M's satisfaction, I'm happy to approve.

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    Great, I think we can now accept this.

    点赞 评论 复制链接分享
  • weixin_39595302 weixin_39595302 4月前

    It sounds like we're more-or-less in agreement about the important things. Thanks for writing this up and trying to tackle this. One other minor concern is that the proposal suggests both a GADT-syntax and a H98 syntax for unlifted data types:

    data unlifted Foo = ... --- H98
    data Foo :: TYPE 'UnliftedRep where -- GADT
    

    My concern is that the unlifted keyword is optional when using GADT syntax. Because of the kind signature, the meaning is still clear, but this would establish a precedent that a kind signature can change the runtime behavior of a program. If that user had instead written:

    data Foo where
    

    The runtime behavior of their code would be different. CUSKs on data types can currently influence where or not the compiler accepts them, but they never change the operational semantics of a program. (I could be wrong about this though.) I think it would be more consistent to require:

    data unlifted Foo :: TYPE 'UnliftedRep where -- GADT
    

    I don't think this is big deal. It's my aesthetic preference, but I thought I'd mention it in case it resonated with anyone else.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I didn't have a strong opinion on this before and wasn't sure what would be the best solution. But I agree now that we should probably require the unlifted keyword either way and that the KindSignature (if any) should match that keyword.

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    Under "motivation" the proposal says

    
    data unlifted Tree a
      = Branch !(Tree a) a !(Tree a)
      | Leaf
    

    But in this defintion the ! are redundant. Just omit them!

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Ha, good point!

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    Now for the rest of this proposal, we assume that the plan outlined in #16970 has landed, in particular for its operational consequence that strict constructor fields always contain tagged pointers (meaning their tag is never zero, like for thunks).

    I think that patch is 100% irrelevant to this proposal.

    In this proposal, if T is delcared with data unlifted T =..., then a value of type T is always represented by a properly-tagged pointer to the value, period. Just like Array# t today. The MR #16970 concerns strict fields of a data constructor, where the strict field is a lifted data type, say Bool. Thus:

    
    data Ordinary = MkOrdinary !Bool Int
    

    Here the patch adds a new invariant that although some Bool-values can be bottom, and represented by an untagged pointer, the Bool in a MkOrdinary is always always a properly-tagged pointer to a boolean value.

    Nothing to do with this proposal!

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    topdecl -> 'data' 'instance' [ 'unlifted' ] [ context => ] ...

    No, I don't think you can do this for data instances -- unless you make the data family (and hence all its instances) unlifted. OK you can do that, but please spell it out.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I think that patch is 100% irrelevant to this proposal.

    Correct. I deleted that paragraph.

    No, I don't think you can do this for data instances -- unless you make the data family (and hence all its instances) unlifted. OK you can do that, but please spell it out.

    We don't have to do anything specific for data families (well, other than making sure result kinds match up), the generalisation already landed with the UnliftedNewtypes proposal. I added a paragraph to the static semantics section:

    Data family instances may be declared unlifted, in which case the data family application's result kind must reduce to TYPE 'UnliftedRep. See the section on data families in the UnliftedNewtypes proposal and Note [Implementation of UnliftedNewtypes] for details involving type-checking the parent data family.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Here's a very rough prototype that miraculously just works on the following file:

    haskell
    {-# LANGUAGE UnliftedDatatypes #-}
    {-# LANGUAGE MagicHash #-}
    
    module Lib where
    
    import GHC.Exts
    
    pack :: Bool -> Bool -> Int#
    pack False False = 0#
    pack False True  = 1#
    pack True  False = 2#
    pack True  True  = 3#
    
    data unlifted SBool = STrue | SFalse
    
    spack :: SBool -> SBool -> Int#
    spack SFalse SFalse = 0#
    spack SFalse STrue  = 1#
    spack STrue  SFalse = 2#
    spack STrue  STrue  = 3#
    
    data Tree a
      = Branch !(Tree a) a !(Tree a)
      | Leaf
    
    tsum :: Tree Int -> Int
    tsum Leaf           = 0
    tsum (Branch l x r) = tsum l + x + tsum r
    
    data unlifted STree a
      = SBranch (STree a) a (STree a)
      | SLeaf
    
    stsum :: STree Int -> Int
    stsum SLeaf           = 0
    stsum (SBranch l x r) = stsum l + x + stsum r
    

    And it generates exactly the Cmm we expect it to generate (so no zero tag checks for unlifted variants).

    点赞 评论 复制链接分享
  • weixin_39947812 weixin_39947812 4月前

    It occurred to me that while we want these types to be unlifted, I think we don't want them to be UnliftedRep as it's more restrictive than desirable.

    In particular it's not uncommon to have top level static data. Either explicitly user written, or floated to the top during code transformations.
    Something not possible for things of UnliftedRep currently.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I think the solution to the top-level problem is not declaring them lifted, but rather making sure that we only allow unlifted normal forms (NB: not only WHNF) at the top-level. Problem solved!

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    This is now ready for review by the committee. Is there anything more I have to do?

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    I'm having trouble with the worker-wrapper example. The proposal suggests that we could end up with

    
    foo :: Int -> SPair Int Int
    foo x = case $wfoo x of (# Force a, Force b #) -> SPair a b
    
    $wfoo :: Int -> (# Strict Int, Strict Int #)
    $wfoo x
      | even x
      = (# Force (x + 1), Force x #)
      | otherwise
      = case $wfoo (x-1) of
          (# Force a, Force b #) -> (# Force (a+1), Force (b+1) #)
    

    But every Force here is a box, so this doesn't actually help unless there's also some mechanism to elide these boxes. Indeed the proposal calls out this lack in the section on Effect and Interactions:

    This proposal consciously left out [...] details of whether we should eliminate the indirection in constructors like Force (we certainly should!) and to what degree we could infer and let the user omit Force constructors.

    so given that this isn't part of the current proposal, is the work-wrapper example relevant here?

    Moreover, the pack example suggests

    we can wrap all Bool s in Strict (see above) or define a new unlifted SBool

    but this doesn't work with Strict, due to the box.

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    It occurred to me that while we want these types to be unlifted, I think we don't want them to be UnliftedRep as it's more restrictive than desirable.

    It's not clear to me how you could have unlifted types without them having kind UnliftedRep. It's precisely the restrictions on types of kind TYPE 'UnliftedRep that give us the guarantee that a value of that type is always a tagged pointer to the object, never a thunk or indirection.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Re: WW: Yes, I wrote the WW stuff when I was writing https://github.com/ghc-proposals/ghc-proposals/pull/257, which is basically succeeded by this proposal. At the time I was pretty convinced that we could always get rid of the extra indirection, which turns out not to be true (see the code block in https://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-552865750). WW'ing for Strict doesn't make sense if we would have to introduce a box. Same for the pack example. I'll update accordingy.

    Re: UnliftedRep, yes, I'm not sure what Andreas was alluding to. But this was before the discussion in #17521, so I think that comment is no longer relevant.

    点赞 评论 复制链接分享
  • weixin_39960700 weixin_39960700 4月前
    • Given that we cannot have top-level unlifted bindings, are we allowed to have nullary constructors of unlifted datatypes?

    • The proposal mentions kind #, which was gone as of GHC 8.0. I think you mean TYPE UnliftedRep. (Actually, all RuntimeReps other than LiftedRep are call-by-value, but we have no common way to refer to the lot of them.)

    • This potentially steals syntax, in case someone has a datatype context that begins with a type variable named unlifted. No one has ever done this, I'm sure. But there may be a small parsing hiccup.

    • I'm a bit bothered by the fact that -XUnliftedNewtypes uses a kind annotation (optionally) to control the kind of the type, while this proposal uses a new keyword. I would prefer that this proposal uses a kind annotation data UPair a b :: TYPE UnliftedRep where .... (Yes, I know this means that H98 syntax is not allowed. Although there's no reason we cannot have kind annotations on H98 syntax. I think.) We could provide a type synonym to avoid the scary TYPE: data UPair a b :: UnliftedData where ....

    • If we are going to have data unlifted, I'd prefer to also have data lifted. I'm frequently bothered by GHC's asymmetry between laziness and strictness and would sometimes prefer to have GHC require me to label everything as either lazy or strict (and thus force me to think about what's better). This isn't possible today. (For example, we can write data MixedPair a b = MP ~a !b only with -XStrictData, which sadly also changes the default.) More symmetry is better! But I'd still prefer using kind annotations.

    点赞 评论 复制链接分享
  • weixin_39595302 weixin_39595302 4月前

    I'd like to push back a little bit on the fourth point Richard makes (the one about using a kind annotation instead of a keyword). Earlier in this thread, referring to using just a kind signature to change the levity of a data type, I wrote:

    ... this would establish a precedent that a kind signature can change the runtime behavior of a program. If that user had instead written ... the runtime behavior of their code would be different. CUSKs on data types can currently influence where or not the compiler accepts them, but they never change the operational semantics of a program. (I could be wrong about this though.)

    Although UnliftedNewtypes uses optional kind annotations, the big difference is that, in that setting, there is only one possible kind for a newtype declaration, so adding the kind signature does not influence the behavior of a well-typed program. With UnliftedDatatypes, this isn't the case. If the levity of a data type can be specified by solely the kind signature, then that kind signature could change the behavior of a well-typed program. And the fact that that could happen bothers me (only a little bit).

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前
    • Given that we cannot have top-level unlifted bindings, are we allowed to have nullary constructors of unlifted datatypes?

    Bummer! Indeed that is a problem for data con wrappers. Currently, if we define

    haskell
    {-# LANGUAGE UnliftedDatatypes #-}
    {-# LANGUAGE GADTs #-}
    
    module Lib where
    
    data unlifted T a where
      TInt :: T Int
    

    We get the following data con wrapper (with the prototype implementation):

    
    Lib.$WTInt [InlPrag=INLINE[0]] :: T Int
    [GblId[DataConWrapper],
     Caf=NoCafRefs,
     Str=m,
     Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
             WorkFree=True, Expandable=True,
             Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)
             Tmpl= Lib.TInt @ Int @~ (<int>_N :: Int GHC.Prim.~# Int)}]
    Lib.$WTInt = Lib.TInt @ Int @~ (<int>_N :: Int GHC.Prim.~# Int)
    </int></int>

    which is illegal with the current unlifted top-level binding policy. In the implementation it just seems to work, but it's hard to test because the wrapper will be inlined immediately. It could potentially break because the binding might not be properly tagged. My gut says this is highly unlikely because we can always see the constructor it evaluates to, but I'm not an authority on how pointer tagging for top-level data is implemented.

    Also note that this is purely an implementation detail: Haskell-the-language doesn't have the concept of data con wrappers and thus doesn't need the very limited form of unlifted top-level bindings we have here.

    Edit: Indeed, the above program fails CoreLint:

    
    warning:
        Recursive or top-level binder has strict demand info: $WTInt
        Binder's demand info: <l>
        In the RHS of $WTInt :: T Int
        Substitution: [TCvSubst
                         In scope: InScope {$tc'TInt $trModule $tc'TInt2 $tc'TInt1 $tc'TInt3
                                            $tcT $tcT2 $tcT1 $tcT3 $trModule3 $trModule1 $trModule2
                                            $trModule4 $WTInt $krep_run $krep1_ruC $krep2_ruD}
                         Type env: []
                         Co env: []]
    </l>

    kind #

    Oh, right. Will get rid of it.

    This potentially steals syntax, in case someone has a datatype context that begins with a type variable named unlifted. No one has ever done this, I'm sure. But there may be a small parsing hiccup.

    I thought I already had written something in that regard, but apparently I only did so in the implementation. You are right, we're turning a well-formedness error into a parsing error. It's the same situation as with instance.

    I'm a bit bothered by the fact that -XUnliftedNewtypes uses a kind annotation (optionally) to control the kind of the type, while this proposal uses a new keyword.

    That's a discussion I'm willing to have, but I'm leaning towards the keyword a little bit. It's also a lot less verbose.

    If we are going to have data unlifted, I'd prefer to also have data lifted.

    Me too. I'll amend the proposal and wait for someone to contest.

    点赞 评论 复制链接分享
  • weixin_39960700 weixin_39960700 4月前

    Annotations affecting runtime behavior has been with us from the beginning. If I write

    
    module Main where
    x = 9223372036854775807
    y = x + 1
    main = print y
    

    and run, I see 9223372036854775808.

    But if I add

    hs
    x :: Int
    

    and run, I get -9223372036854775808. I do believe that (short of trickery involving type families) this may be the first time a kind annotation affects runtime behavior, but that doesn't seem like much of a stretch.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    One thing that's really weird about having the unlifted keyword is described in the proposal as an interaction with -XUnliftedNewtypes:

    -XUnliftedNewtypes introduces unlifted newtypes, but does so simply by inferring the kind of its single constructor's field type, no unlifted needed. Now with the new unlifted keyword, we could potentially allow syntax like newtype unlifted Foo (a :: TYPE r) = Foo a. What are its semantics? Can we still have Foo Int#? That wouldn't exactly be UnliftedRep (which this proposal is all about), but the unlifted, unboxed runtime-rep IntRep. Similarly, do we allow Foo Int? That would be boxed and lifted, seemingly contradicting the declaration.

    So I suggest that we (somewhat ironically) disallow unlifted syntax for Newtype delcarations and instead suggest to activate -XUnliftedNewtypes, which will automatically infer the generalised kind.

    If we don't add the contextual keyword, we can just say that the return kind must evaluate to TYPE 'UnliftedRep. That may exclude H98 decls, but with -XStandaloneKindSignatures we can have the shorter syntax for H98 decls, too, right? That way we wouldn't even have to introduce new syntax.

    Otherwise, I like the keyword for its terseness.

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    I like the proposal. Indeed I have wanted this for a long time. Much of GHC's internals is ready for it; mostly we just need source-language syntax for it, which this proposal provides.

    Some thoughts:

    • The proposal allows lifted or unlifted data types. But how about levity polymorhism? Assuming the PointerRep proposal is implemented we might want to say
      
        data T (l :: Levity) :: TYPE (PtrRep l) where
           ...constructors...
        

    Then T Lifted would be lifted while T UnliftedRep would not be. I think that, again, most of GHC's internals would work fine with this. But even if we don't want this yet, it does call into question the fixed "lifted" and "unlifted" keywords.

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    Putting the proposal into "needs revision" state while the outstanding issues that arose during the committee review phase are addressed.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I have updated the proposal to look at the kind signature instead to determine levity.

    The WIP implementation is out of sync now, but that can be fixed at a later point, I think.

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    Any other outstanding issues, or are we ready to accept this?

    点赞 评论 复制链接分享
  • weixin_39753211 weixin_39753211 4月前

    I'll be a bit of a bore, but as proposal need to be self-contained, may I ask you to add the unlifted keyword story in the Alternatives section.

    I like to phrase this sort of subsection as “a previous version of this proposal used […] but it was decided against for [such and such]”.

    This way you can keep your possible grammar change for future documentation, but it is also documents the reason for the particular design choices without having to trawl through the Github conversation. Finally, as much as possible, the committee is supposed to take a decision based solely on the proposal document, it must therefore figure all discussed alternatives as accurately as possible.

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    I would love the proposal to talk about levity polymorphism. That is

    • Does it allow data T l :: TYPE (BoxedRep l) where ...?

    I suspect that the answer is "not in this proposal", but since the design has changed specifically to alow the possiblity of this future development, it would be good to discuss this future path explicitly.

    Also, wny not support it now?

    点赞 评论 复制链接分享
  • weixin_39960700 weixin_39960700 4月前

    A few more details from me:

    • The proposal should be (slightly) rephrased to be compatible with #301/#203, where we now say BoxedRep Lifted instead of LiftedRep.

    • The proposal should clarify that, even with -XUnliftedDatatypes, datatypes will be defaulted to have kind TYPE (BoxedRep Lifted). Indeed, the only way to get an unlifted datatype is via a kind signature (standalone or otherwise).

    • The sentence "The static semantics of other types of unlifted kind, such as the inability to declare them at the top-level, apply." seems to say the wrong thing. After beta-reduction, I interpret this to say "The inability to declare other types of unlifted kind at the top-level applies." I don't think this is what you mean.

    • The specification does not say whether nullary constructors of an unlifted type are allowed. This isn't me nitpicking -- I don't know the answer! (At least, not from the specification section; I have not reread the entire proposal.)

    • I second Simon's question about levity polymorphism. I think it should be easy to accommodate here.

    点赞 评论 复制链接分享
  • weixin_39994665 weixin_39994665 4月前

    can unlifted data be allocated directly into compact region?

    点赞 评论 复制链接分享
  • weixin_39960116 weixin_39960116 4月前

    Thanks!

    Implement the Strict data type only. Doing so provides the same semantics at the cost of more syntactic overhead.

    More syntactic overhead, but also more composable! I don’t need Bool and StrictBool and Maybe and StrictMaybe, I can use Strict Bool and Strict Maybe.

    At first glance, this makes me lean much more towards the Strict proposal.

    点赞 评论 复制链接分享
  • weixin_39960116 weixin_39960116 4月前

    Ah, but this allows you to implement Strict, so we can have our cake and eat it too. Compelling!

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Yes, the only real drawback compared to implementing Strict only would have been that this is a superset of the functionality #257 provides, so ideally should be implemented in a separate proposal. But having done a little prototyping, I think that we would have to make Force an actual DataCon anyway, so that it is properly handled by the simplifier and other passes. At this point adding the tiny bit of surface syntax is just a stone throw away, hence argued in favor of this proposal instead.

    点赞 评论 复制链接分享
  • weixin_39645019 weixin_39645019 4月前

    I assume that we want to UNPACK those strict fields, but I think this cross-kind UNPACK is slightly novel and should be explicitly spelled out in the proposal (and the other proposal too).

    点赞 评论 复制链接分享
  • weixin_39645019 weixin_39645019 4月前

    A bit off topic:

    I hope someday every data type can have an associate unboxed version connected by a type family (let's call it Unbox), and then UNPACK can have a type directed desugaring. Even better if we can do - LiftedBoxedFoo ~R LiftedHsPtr (Unbox LiftedBoxedFoo) - UnliftedBoxedBar ~R UnliftedHsPtr (Unbox UnliftedBoxedBar)

    Then Strict essentially gives us UnliftedHsPtr (Unbox LiftedBoxedFoo).

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    cross-kind UNPACK is slightly novel and should be explicitly spelled out in the proposal

    What do you mean by cross-kind? It's just unpacking the regular constructor fields like before, that's entirely orthogonal to the semantic concern of there being no bottom as an inhabitant of that constructor's type.

    Unless you mean the implicit unpacking of constructors like Force, that's an entirely different matter but with no consequences for perceived semantics, it's just an optimisation affecting operational semantics. Note that this doesn't UNPACK Forces argument, rather we want to coerce Strict a to a in Core like we do for newtypes, except that one way (a to Strict a) isn't actually zero-cost, but erase to a seq. See Simon's earlier comment and my remarks in https://gitlab.haskell.org/ghc/ghc/wikis/unlifted-data-types#proposal-b4-levity-polymorphic-functions.

    I hope someday every data type can have an associate unboxed version connected by a type family (let's call it Unbox)

    There are plans for having levity polymorphic (unlike the current meaning of levity polymorphism, which is more like representational/shape polymorphism) data types. See https://gitlab.haskell.org/ghc/ghc/wikis/unlifted-data-types#proposal-b3-levity-polymorphic-data-types for a design I would aim at.

    点赞 评论 复制链接分享
  • weixin_39774445 weixin_39774445 4月前

    Generally I like the idea here, and I don't think it'd be all that hard to implement. We already have unlifted data types like Array#; this just makes them user-definable.

    On syntax, a postfix kind signature is rather a quiet signal for such an important change. I'd suggest using a prefix keyword:

    
    data unlifted SPair = SP Int INt
    

    (This "keyword" would have significance only after data so it wouldn't prevent you using "unlifted" as, say, a variable name.

    点赞 评论 复制链接分享
  • weixin_39645019 weixin_39645019 4月前

    Note that this doesn't UNPACK Forces argument, rather we want to coerce Strict a to a in Core like we do for newtypes, except that one way (a to Strict a) isn't actually zero-cost, but erase to a seq.

    Well I'd want

    haskell
    data Strict :: Type -> TYPE 'UnliftedRep where
      Strict :: {-# UNPACK #-} !a -> Strict a
    
    haskell
    data UPairStrict :: Type -> Type -> TYPE 'UnliftedRep where
      UPairStrict :: {-# UNPACK #-} !a -> {-# UNPACK #-} !b -> SPair a b
    

    to work somewhat similarly. The second cannot be conceived of as a one-way coercion.

    But see how https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0098-unlifted-newtypes.rst#id4 points the way to levity-polymorphic newtypes. I think the kind signature is more in tune with that.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    If I understand correctly, then I don't think what you want to have is possible. You can't unpack polymorphic fields, at least not without specialisation of the data type, at that's certainly out of scope for this proposal. To exemplify, what is the static offset of the b field in UPairStrict? If we'd unpack the a field, then it depends on the size of the representation of a, which means you can't use UPairStrict in code polymorphic over a.

    UNPACK concerns the fields of a constructor (Force's, in particular). After unpacking, the heap object still has its own header. I was rather aiming at making the constructor Force have no representation at runtime at all, because its only field is conveniently of the exact same representation as the constructor itself (i.e. a pointer to the heap).

    This is probably most similar in spirit to turning a lifted data type with a single strict boxed field (So as Strict above, but with regular 'LiftedRep) into a newtype, except that that would be broken semantics because a pattern match on the constructor actually evaluates the field (as opposed to newtypes).

    点赞 评论 复制链接分享
  • weixin_39754411 weixin_39754411 4月前

    The proposal mentions that we could have an unlifted Ptr type. But that wouldn't be a drop-in replacement for the current Ptr type, because it has an unlifted kind and therefore can't be passed to polymorphic functions or stored in polymorphic data structures, right? e.g. you can't have a [Ptr a] when Ptr is unlifted. That should probably be mentioned as a drawback.

    (aside from this I'm strongly in favour of the proposal)

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Yes, currently we have no means of encoding the necessary levity polymorphism. See https://gitlab.haskell.org/ghc/ghc/wikis/unlifted-data-types#proposal-b3-levity-polymorphic-data-types in that regard. I think levity polymorphism (i.e. lifted vs. unlifted, but still boxed pointer to GC'd heap, as opposed to the current interpretation, which is more like representation/shape polymorphism) should be discussed in a subsequent proposal.

    Or maybe it should be part of this proposal, I've no idea what the guidelines are wrt. incrementality of GHC proposals. But I'd rather implement it in separate steps.

    点赞 评论 复制链接分享
  • weixin_39645019 weixin_39645019 4月前

    Right, I completely forgot about in normal cases need a concrete constructor to unpack, so no polymorphism there.

    But perhaps we do anything special with data AlmostNewtype a = AlmostNewtype !a today? By code gen, whether matching on AlmostNewtype forces anything should be irrelevant, and the same unboxing opportunities should be available. Perhaps we should, so the Strict unpacking is less of a one-off optimization.

    Back to my original, question I assume plain:

    haskell
    data Strict :: TYPE 'UnliftedRep where
      Strict :: {-# UNPACK #-} !Concrete -> Strict a
    
    haskell
    data UPairStrict :: TYPE 'UnliftedRep where
      UPairStrict :: {-# UNPACK #-} !Concrete0 -> {-# UNPACK #-} !Concrete1 -> SPair a b
    

    works fine?

    I do really really want "template" / "monomorphized" polymorphism, but I think should be separate proposal. There is much to decide about that alone.

    点赞 评论 复制链接分享
  • weixin_39872123 weixin_39872123 4月前

    Not knowing about this proposal I just added #268. I noticed that we already have support for unlifted types with just one constructor via newtype and that it can be extended to more types with possible compiler simplification by implementing case analysis in two ways, one for data and one for newtype instead of mere coercing (I think I read that's what you do).

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Coercing between unlifted and lifted rep are not actually essential to the proposal. Levity polymorphic coercions might be interesting to have in Core to support a zero-cost Strict data type, but that's irrelevant to Haskell as a language. It might become relevant later on when we want to have levity polymorphism (in the sense of lifted vs. unlifted, not boxed vs. unboxed) in the surface language.

    I don't see how implementing case analysis in two ways leads to compiler simplification. Note that case already works on terms of unlifted rep, so my proposal wouldn't actually imply a change to semantics at all. Everything is already in place! We just offer the needed vocabulary to the programmer to define her own unlifted data types.

    点赞 评论 复制链接分享
  • weixin_39872123 weixin_39872123 4月前

    What are the semantics of Strict?

    On Wed, 4 Sep 2019, 14:53 Sebastian Graf, wrote:

    Coercing between unlifted and lifted rep are not actually essential to the proposal. Levity polymorphic coercions might be interesting to have in Core to support a zero-cost Strict data type, but that's irrelevant to Haskell as a language. It might become relevant later on when we want to have levity polymorphism (in the sense of lifted vs. unlifted, not boxed vs. unboxed) in the surface language.

    I don't see how implementing case analysis in two ways leads to compiler simplification. Note that case already works on terms of unlifted rep, so my proposal wouldn't actually imply a change to semantics at all. Everything is already in place! We just offer the needed vocabulary to the programmer to define her own unlifted data types.

    — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/ghc-proposals/ghc-proposals/pull/265?email_source=notifications&email_token=AHZ4ZZLHIAIPYR2NNVKZDDLQH64UDA5CNFSM4IPJ5H62YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD53UOBI#issuecomment-527910661, or mute the thread https://github.com/notifications/unsubscribe-auth/AHZ4ZZJKYQDCMVQOPKBN5LLQH64UDANCNFSM4IPJ5H6Q .

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    By code gen, whether matching on AlmostNewtype forces anything should be irrelevant, and the same unboxing opportunities should be available. Perhaps we should, so the Strict unpacking is less of a one-off optimization.

    Hmm. I felt a strong urge to disagree but couldn't come up with an example. So perhaps eliminating indirections for types like AlmostNewtype (and Strict of course) is indeed a sound transformation, if a little surprising for debugging purposes.

    Back to my original, question I assume plain:

    haskell
    data Strict :: TYPE 'UnliftedRep where
      Strict :: {-# UNPACK #-} !Concrete -> Strict a
    

    haskell
    data UPairStrict :: TYPE 'UnliftedRep where
      UPairStrict :: {-# UNPACK #-} !Concrete0 -> {-# UNPACK #-} !Concrete1 -> SPair a b
    

    works fine?

    Yes, it should just unbox the fields, as today.

    I do really really want "template" / "monomorphized" polymorphism, but I think should be separate proposal. There is much to decide about that alone.

    The best bet is probably a backpack inspired solution. See unpacked-containers for example.

    点赞 评论 复制链接分享
  • weixin_39872123 weixin_39872123 4月前

    can I pattern match on x : Strict (GHC.Tuple.Unit a)?

    case x of {Force (Unit y) = y}
    

    what is undefined :: Strict (GHC.Tuple.Unit a) ? is it still _|_ - or Force _|_?

    How do you force types with two constructors? The attempt to force will be _|_ won't it, meaning the type isn't lifted.

    点赞 评论 复制链接分享
  • weixin_39872123 weixin_39872123 4月前

    Oh I see you can pattern match as above, the tree example is missing parentheses but I'm still curious about the matter of _|_ and undefined

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    I updated the Change Specification to be more formal. There's an open TODO discussion concerning syntax, so I expect this thread to be on fire very soon 🔥

    点赞 评论 复制链接分享
  • weixin_39835158 weixin_39835158 4月前

    Great proposal! I wish it at least sketched the path to the zero-cost Strict data type.

    The worker-wrapper example is what motivates me write this. Here's the status quo example from the proposal.

    
    foo :: Int -> SPair Int Int
    foo x = case $wfoo x of (# a, b #) -> SPair a b
    
    $wfoo :: Int -> (# Int, Int #)
    $wfoo x
      | even x
      = (# (x + 1), x #)
      | otherwise
      = case $wfoo (x-1) of
          (# a, b #) -> (# a+1, b+1 #)
    

    To me it's a bit surprising that the type of $wfoo is not Int# -> (# Int#, Int# #) (maybe I'm overlooking something trivial). I'd like to see worker-wrapper transform that produces the second type signature, and I'm assuming it will happen at some point. And it would be a shame if Strict would get in the way of it because it cannot accept unboxed arguments.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    Great proposal! I wish it at least sketched the path to the zero-cost Strict data type.

    Thanks! I think zero-cost Strict can be an implementation detail (without which w/w to it makes no sense, though).

    To me it's a bit surprising that the type of $wfoo is not Int# -> (# Int#, Int# #)

    What you want is Nested CPR (WIP MR). It's not entirely trivial to unbox the inner Ints.

    点赞 评论 复制链接分享
  • weixin_39595302 weixin_39595302 4月前

    I really like the meat of the proposal, but I'm not convinced that all the applications are accurate. I do think that we should be able to write data Foo :: TYPE 'UnliftedRep (or something like that). It's a clear win. I need this all the time.

    However, here are two of the applications that I have qualms with. One is the example with Ptr. UnliftedNewtypes will be in GHC 8.10. There's a section in the UnliftedNewtypes proposal that talks about an unlifted Ptr type as an application. Specifically, you will be able to write:

    newtype Ptr# :: Type -> TYPE 'AddrRep where
      Ptr# :: Addr# -> Ptr# a
    

    The allocation-related guarantees provided by this type are strictly better than the guarantees provided by the type suggested in this proposal:

    data Ptr a :: TYPE 'UnliftedRep where
      Ptr :: !Addr# -> Ptr a -- also there shouldn't be a need for the bang here
    

    In particular, Ptr# leads to data constructor allocations while Ptr does in situations where worker-wrapper doesn't work (like when you pass a function as an argument to a function that doesn't inline). So I don't find that particular use case motivating.

    The other problem I have is with the Strict data type. I'm troubled by the idea of turning a value strict by wrapping it in another data constructor. I know that the bang inside it means that the argument gets forced everywhere. It's just strange to me that it accomplishes strictness by using a mechanism that lives outside of the type system (bangs on data constructor fields), which inserts seq everywhere the constructor is applied. Similar to the Ptr example, this also doesn't work correctly with higher-order functions that don't inline. In fact, it would leads to more allocations than you would have seen with the lazy type.

    I have a half-baked proposal that puts strictness in kinds instead and doesn't suffer from this problem, but it's more complicated, changes the type of seq, and would introduce coercions all over the place in GHC Core, since data types would all be desugared to newtypes over the real under-the-hood strictness-polymorphic data type.

    Ultimately, I don't think that dual-purpose types that communicate both strictness and laziness is really that big of a deal though. I just don't really need thunks very often. With most types I deal with, I want values to be strict all of the time, and as it stands, this proposal gives me a way to get that guarantee. So, +1.

    点赞 评论 复制链接分享
  • weixin_39645019 weixin_39645019 4月前

    The optimization that works to unbox

    haskell 
    data AlmostNewtype a = MkAlmostNewtype !a
    data AlmostNewtype' a :: Type 'UnliftedRep = MkAlmostNewtype' !a
    

    I think is a good one, even if Strict is replaced with something a bit more straightforward like your forget and seqTyped (which, BTW, I don't see why we couldn't use LiftedRep and UnliftedRep if we someday retrofit existing data definitions to be more polymorphic). As such, I don't really mind Strict in the meantime as it's not actually a hack for a single nominal type. Ultimately we'll want https://github.com/ghc-proposals/ghc-proposals/issues/198 to clean all this stuff up.

    点赞 评论 复制链接分享
  • weixin_39622521 weixin_39622521 4月前

    However, here are two of the applications that I have qualms with. One is the example with Ptr.

    In particular, Ptr# leads to data constructor allocations while Ptr does in situations where worker-wrapper doesn't work (like when you pass a function as an argument to a function that doesn't inline). So I don't find that particular use case motivating.

    Yes, your observation is spot on. There's no compelling reason to choose Ptr over Ptr#, other than that Ptr is of a different kind. But that also means we can't get rid of the indirection in higher-order scenarios. I think I'll just delete that example.

    The other problem I have is with the Strict data type. I'm troubled by the idea of turning a value strict by wrapping it in another data constructor. I know that the bang inside it means that the argument gets forced everywhere. It's just strange to me that it accomplishes strictness by using a mechanism that lives outside of the type system (bangs on data constructor fields), which inserts seq everywhere the constructor is applied.

    Right. One could argue that we should rather formulate bang patterns in terms of Strict, but that horse is out the barn. Semantically, Strict is exactly what we want, but operationally we really don't want it to be a data constructor. Good news is that this part could entirely be a follow-up proposal.

    Similar to the Ptr example, this also doesn't work correctly with higher-order functions that don't inline. In fact, it would leads to more allocations than you would have seen with the lazy type.

    Yes. In a levity polymorphic world, we might want to call dataToTag# on a Strict thing in a polymorphic scenario, in which case we can't elide the data con:

    haskell
    f :: (forall a :: TYPE 'UnliftedRep). a -> Int#
    f x = dataToTag# x
    

    If we call this on Force True, I'd expect to get back 0#, but with the optimisation that tries to remove the indirection we would get 1#. Bugger. So we can't do the optimisation if the Strict escapes into a polymorphic context.

    It really seems like we need a coercion-inspired approach (like in your proposal or in https://gitlab.haskell.org/ghc/ghc/wikis/unlifted-data-types#proposal-4-allow-unlifting-existing-data-types-with-no-overhead) to solve this.

    I have a half-baked proposal that puts strictness in kinds instead and doesn't suffer from this problem, but it's more complicated, changes the type of seq, and would introduce coercions all over the place in GHC Core, since data types would all be desugared to newtypes over the real under-the-hood strictness-polymorphic data type.

    I left some comments there.

    Ultimately, I don't think that dual-purpose types that communicate both strictness and laziness is really that big of a deal though. I just don't really need thunks very often. With most types I deal with, I want values to be strict all of the time, and as it stands, this proposal gives me a way to get that guarantee. So, +1.

    Yes, I think this proposal can be largely orthogonal to what you sketch out over in yours. This proposal doesn't try to tackle levity polymorphism just yet, it's just some mostly uncontradictory ground-work that needs to be done first, together with some ideas of how to use it.

    点赞 评论 复制链接分享

相关推荐