Haskell速度问题,即执行程序的两个部分所需的时间明显长于单独执行其中一部分所需的时间

xkftehaa  于 6个月前  发布在  其他
关注(0)|答案(1)|浏览(57)

我有一个Haskell程序,主要有两行代码:

putStrLn $ "Day11: part1: " ++ show (sum $ bigManhattan 1 galaxies <$> pairs)
putStrLn $ "Day11: part2: " ++ show (sum $ bigManhattan 999999 galaxies <$> pairs)

字符串
如果我注解掉其中任何一个,程序运行时间为0.01秒。如果它们都存在,程序需要90秒。
我想知道是否有人有什么想法?他们会不会在竞争数据,互相干扰?
更多的选择-我不知道这些都是做什么的...

ghc-options:
- -Wall
- -Wcompat
- -Widentities
- -Wincomplete-record-updates
- -Wincomplete-uni-patterns
- -Wmissing-export-lists
- -Wmissing-home-modules
- -Wpartial-fields
- -Wredundant-constraints
- -O2


代码:

module Day11(day11) where

import Data.List ((\\))
import Data.Maybe (catMaybes)

type Coord = (Int, Int)

manhattan :: Coord -> Coord -> Int
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)

getF :: (String -> a) -> Int -> IO a
getF f n = do
  s <- readFile $ "./Data/Day" ++ show n ++ ".in"
  return $ f s

getLines :: Int -> IO [String]
getLines = getF lines

parse :: [String] -> [Coord]
parse css = concatMap (catMaybes . (\(y, cs) -> (\(x, c) -> if c=='#' then Just (x,y) else Nothing) <$> zip [0..] cs)) (zip [0..] css)

nSize :: Int
nSize = 140

bigManhattan :: Int -> [Coord] -> (Coord, Coord) -> Int
bigManhattan k galaxies ((c1, r1), (c2, r2)) = manhattan (c1+newc1, r1+newr1) (c2+newc2, r2+newr2)
  where
    baseC, baseR :: [Int]
    baseC = [0..(nSize-1)] \\ (fst <$> galaxies)
    baseR = [0..(nSize-1)] \\  (snd <$> galaxies)
    newc1, newc2, newr1, newr2 :: Int
    newc1 = k * length (filter (c1>) baseC)
    newc2 = k * length (filter (c2>) baseC)
    newr1 = k * length (filter (r1>) baseR)
    newr2 = k * length (filter (r2>) baseR)

day11 :: IO ()
day11 = do
  ls <- getLines 11
  let galaxies = parse ls
      pairs = [(x,y) | x <- galaxies, y <- galaxies, x<y ]

  putStrLn $ "Day11: part1: " ++ show (sum $ bigManhattan 1 galaxies <$> pairs)
  --putStrLn $ "Day11: part2: " ++ show (sum $ bigManhattan 999999 galaxies <$> pairs)
  return ()


使用GHC 9.4.7.

3okqufwl

3okqufwl1#

请注意,baseCbaseR依赖于galaxies,但不依赖于pairs。当您只打印两个结果中的一个时,GHC内联所有bigManhattan,并能够将baseCbaseRsum中提取出来。但是,当您打印两个结果时,内联被抑制,pairs的每个元素上的每个bigManhattan调用都会重新计算baseCbaseR。解决这个问题的最简单方法是将summap移动到bigManhattan中,以确保baseCbaseR每次求和只计算一次。可以做得更好,因为它们也不依赖于k,但这似乎工作得足够快。

bigManhattan :: Int -> [Coord] -> [(Coord, Coord)] -> Int
bigManhattan k galaxies = sum . map go
  where
    baseC, baseR :: [Int]
    baseC = [0..(nSize-1)] \\ (fst <$> galaxies)
    baseR = [0..(nSize-1)] \\  (snd <$> galaxies)

    go ((c1,r1),(c2,r2)) = manhattan (c1+newc1, r1+newr1) (c2+newc2, r2+newr2)
      where newc1, newc2, newr1, newr2 :: Int
            newc1 = k * length (filter (c1>) baseC)
            newc2 = k * length (filter (c2>) baseC)
            newr1 = k * length (filter (r1>) baseR)
            newr2 = k * length (filter (r2>) baseR)

day11 :: IO ()
day11 = do
  ls <- getLines 11
  let galaxies = parse ls
      pairs = [(x,y) | x <- galaxies, y <- galaxies, x<y ]

  putStrLn $ "Day11: part1: " ++ show (bigManhattan 1 galaxies pairs)
  putStrLn $ "Day11: part2: " ++ show (bigManhattan 999999 galaxies pairs)
  return ()

字符串

相关问题