Hacker News new | past | comments | ask | show | jobs | submit login

In Haskell I think it's something like:

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import qualified Data.Text as T

    type ParseError = String

    csgParse :: T.Text -> Either ParseError Int
    csgParse = eitherResult . parse parser where
    parser = do
        as <- many' $ char 'a'
        let n = length as
        count n $ char 'b'
        count n $ char 'c'
        char '\n'
        return n

    ghci> csgParse "aaabbbccc\n"
    Right 3



The question comes from Haskell, yes: https://byorgey.wordpress.com/2012/01/05/parsing-context-sen...

You used monadic parser, monadic parsers are known to be able to parse context-sensitive grammars. But, they hide the fact that they are combiinators, implemented with closures beneath them. For example, that "count n $ char 'b'" can be as complex as parsing a set of statements containing expressions with an operator specified (symbol, fixity, precedence) earlier in code.

In Haskell, it is easy - parameterize your expression grammar with operators, apply them, parse text. This will work even with Applicative parsers, even unextended.

But in Rust? I haven't seen how it can be done.


I tried winnow as suggested elsewhere in this thread, this is what it may look like:

    use winnow::combinator::{ repeat };
    use winnow::token::take_while;
    use winnow::prelude::*;

    pub fn csg_parse(input: &mut &str) -> PResult<usize> {
        let a: &str = take_while(0.., 'a').parse_next(input)?;
        let n: usize = a.len();
        repeat(n, "b").parse_next(input)?;
        repeat(n, "c").parse_next(input)?;
        '\n'.parse_next(input)?;
        return Ok(n);
    }

    #[cfg(test)]
    mod test {
        use super::*;

        #[test]
        fn parses_examples() {
            assert_eq!(Ok(0), csg_parse(&mut "\n"));
            assert_eq!(Ok(3), csg_parse(&mut "aaabbbccc\n"));

            assert!(csg_parse(&mut "abcc\n").is_err());
            assert!(csg_parse(&mut "abbcc\n").is_err());
            assert!(csg_parse(&mut "aabc\n").is_err());
            assert!(csg_parse(&mut "aabbc\n").is_err());
            assert!(csg_parse(&mut "def\n").is_err());
        }
    }


So you are directly express context-sensitivity instead of approximating it with infinite context-free grammar.

Okay, good enough.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: