commit d0ee6f8864c1066fb14672faeb738ff511abc261
Author: Jason Garrett-Glaser
Date:   Thu Aug 20 13:08:25 2009 -0700

    Fix bug in calculation of I-frame costs with AQ.

 encoder/slicetype.c | 3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

commit f6bf719c67c109635eb001ff2b48c24d97ff5f0d
Author: David Conrad
Date:   Wed Aug 19 17:03:02 2009 -0700

    GSOC merge part 1: Framework for ARM assembly optimizations

    x264 will detect which ARM core it's building for and only build NEON asm
    if the target is ARMv6 or above, then enable NEON at runtime.

 Makefile | 12 ++++++++
 common/cpu.c | 79 +++++++++++++++++++++++++++++++++++++++++++-----------
 common/osdep.h | 7 +++++
 configure | 15 +++++++++-
 tools/checkasm.c | 32 +++++++++++++++------
 x264.h | 3 ++
 6 files changed, 121 insertions(+), 27 deletions(-)

commit 333100756c0c27c5a71119a3f80abc80d439332a
Author: David Conrad
Date:   Wed Aug 19 16:18:36 2009 -0700

    Fix a bug in checkasm and two OSX fixes

    MC chroma checkasm test could crash in some situations
    Remove -lmx, as it's not needed and the iPhone doesn't have it.
    Remove unused sqrtf emulation; it breaks if math.h is included.

 common/osdep.h | 3 ---
 configure | 2 +-
 tools/checkasm.c | 2 +-
 3 files changed, 2 insertions(+), 5 deletions(-)

commit 441c36cbc814ac13cc565ce5e6566bcbf5f6d6e2
Author: Jason Garrett-Glaser
Date:   Wed Aug 19 01:49:47 2009 -0700

    Improve QPRD

    Always check the last macroblock's QP, even if the normal search doesn't reach it.
    Raise the failure threshold when moving towards the last macroblock's QP.
    0.2-1% improved compression.

 encoder/analyse.c | 30 +++++++++++++++++++++++-------
 1 files changed, 23 insertions(+), 7 deletions(-)

commit d345ab05391e2b32c0d567aaa9945f63ce892016
Author: Jason Garrett-Glaser
Date:   Tue Aug 18 21:53:28 2009 -0700

    Fix MB-tree with keyint<3

    Also slightly improve VBV keyint handling.
 encoder/encoder.c | 2 +-
 encoder/slicetype.c | 12 +++++-------
 2 files changed, 6 insertions(+), 8 deletions(-)

commit 5e9ae4cb8de4a95a567c6dbfdbaaf8138e695589
Author: Jason Garrett-Glaser
Date:   Tue Aug 18 19:25:45 2009 -0700

    Fix bug in VBV lookahead + no MB-tree

    I-frames need to have VBV lookahead run on them as well.

 encoder/slicetype.c | 7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

commit d6846d3d17ac7289eb3d82f54d19459d3a1a5b21
Author: Jason Garrett-Glaser
Date:   Tue Aug 18 18:37:26 2009 -0700

    Add support for frame-accurate parameter changes

    Parameter structs can now be passed with individual frames.
    The previous method would only change the parameter of what was currently
    being encoded, which due to delay might be very far from an intended exact frame.
    Also add support for changing aspect ratio.
    Only works in a stream with repeating headers and requires the caller to
    force an IDR to ensure instant effect.

 common/common.c | 1 +
 common/frame.c | 1 +
 common/frame.h | 2 +
 encoder/encoder.c | 73 +++++++++++++++++++++++++++++++------------------
 encoder/ratecontrol.c | 8 +++---
 x264.h | 21 ++++++++++++--
 6 files changed, 72 insertions(+), 34 deletions(-)

commit 9892984392cd550e65e3f2cd6fd08d6afb083472
Author: Jason Garrett-Glaser
Date:   Tue Aug 18 15:46:26 2009 -0700

    Fix x264_encoder_reconfig with multithreading

    New behavior: reconfigging the encoder will result in changes being applied
    to each of the encoding threads as they finish encoding the current frame.

 encoder/encoder.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

commit c07d292d7b5298327d835d0c458083de29c23204
Author: Jason Garrett-Glaser
Date:   Sun Aug 16 03:29:49 2009 -0700

    Fix two bugs in QPRD

    QPRD could in some cases force blocks to skip when they shouldn't be ~(+0.01db)
    Force QPRD to abide by qpmin/qpmax restrictions.
 encoder/analyse.c | 9 +++++----
 encoder/rdo.c | 4 ++--
 2 files changed, 7 insertions(+), 6 deletions(-)

commit 1b6086ceb0c346c959a68478b9c36284fbdd5872
Author: Jason Garrett-Glaser
Date:   Sat Aug 15 19:02:31 2009 -0700

    Lookahead VBV

    Use the large-scale lookahead capability introduced in MB-tree for ratecontrol purposes.
    (Does not require MB-tree, however.)
    Greatly improved quality and compliance in 1-pass VBV mode, especially in CBR;
    +2db OPSNR or more in some cases.
    Fix some other bugs in VBV, which should improve non-lookahead mode as well.
    Change the tolerance algorithm in row VBV to allow for more significant
    mispredictions when buffer is nearly full.
    Note that due to the fixing of an extremely long-standing bug (>1 year),
    bitrates may change by nontrivial amounts in CRF without MB-tree.

 common/common.c | 2 +-
 common/frame.c | 2 +
 common/frame.h | 5 ++
 encoder/encoder.c | 13 ++++-
 encoder/ratecontrol.c | 116 ++++++++++++++++++++++++++++++-----------
 encoder/slicetype.c | 137 ++++++++++++++++++++++++++++++++++++-------------
 6 files changed, 206 insertions(+), 69 deletions(-)

commit 1febc6bd5dc4f625049273f4d86472ed770e3882
Author: Jason Garrett-Glaser
Date:   Fri Aug 14 07:20:07 2009 -0700

    Fix bug in b-adapt 1

    B-adapt 1 didn't use more than MAX(1,bframes-1) B-frames when MB-tree was off.

 encoder/slicetype.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit 8236cc979e1b4d8aae1aecbaae28c3017e375a6f
Author: Jason Garrett-Glaser
Date:   Thu Aug 13 17:13:33 2009 -0700

    Fix a potential failure in VBV

    If VBV does underflow, ratecontrol could be permanently broken for the rest of the clip.
    Revert part of the previous VBV changes to fix this.

 encoder/ratecontrol.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit 42d6b17b439fc90793d19788e7b314e2517c1fc2
Author: Anton Mitrofanov
Date:   Thu Aug 13 21:40:21 2009 +0000

    new API function x264_encoder_delayed_frames.
    fix x264cli on streams whose total length is less than the encoder latency.
 encoder/encoder.c | 17 +++++++++++++++++
 x264.c | 7 ++++---
 x264.h | 6 +++++-
 3 files changed, 26 insertions(+), 4 deletions(-)

commit 62df17ec0523474c8f03e28fb982de1722497288
Author: Jason Garrett-Glaser
Date:   Thu Aug 13 14:12:26 2009 -0700

    Add no-mbtree to fprofile (and fix pyramid in fprofile)

 Makefile | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit 7f8aa6fa1819a62196c1953d85fa6e3d504e39b7
Author: Jason Garrett-Glaser
Date:   Sun Aug 9 16:06:52 2009 -0700

    Don't print a warning about direct=auto in 2pass when B-frames are off

 encoder/encoder.c | 5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

commit 8eb6720e89f441056948818558113f29b3a282c6
Author: Loren Merritt
Date:   Thu Aug 13 05:02:59 2009 +0000

    fix lowres padding, which failed to extrapolate the right side for some resolutions.
    fix a buffer overread in x264_mbtree_propagate_cost_sse2. no effect on actual
    behavior, only theoretical correctness.
    fix x264_slicetype_frame_cost_recalculate on I-frames, which previously used all 0 mb costs.
    shut up a valgrind warning in predict_8x8_filter_mmx.

 common/frame.c | 16 ++++++++--------
 common/macroblock.c | 3 ++-
 common/x86/mc-a2.asm | 8 ++++----
 encoder/slicetype.c | 2 +-
 4 files changed, 15 insertions(+), 14 deletions(-)

commit 9a54c483f5f12aa5614ba0a4a4cfaee19377047e
Author: Loren Merritt
Date:   Sun Aug 9 04:00:36 2009 +0000

    simd part of x264_macroblock_tree_propagate.
    1.6x faster on conroe.

 common/macroblock.c | 3 ++-
 common/mc.c | 29 +++++++++++++++++++++++++++++
 common/mc.h | 3 +++
 common/x86/mc-a2.asm | 41 +++++++++++++++++++++++++++++++++++++++++
 common/x86/mc-c.c | 3 +++
 encoder/encoder.c | 1 +
 encoder/slicetype.c | 13 ++++++-------
 tools/checkasm.c | 26 ++++++++++++++++++++++++++
 8 files changed, 111 insertions(+), 8 deletions(-)

commit 886d1e9878a6f2424bd005a9cb16843ca8e8d1df
Author: Loren Merritt
Date:   Sat Aug 8 14:53:27 2009 +0000

    MB-tree fixes:
    AQ was applied inconsistently, with some AQed costs compared to other non-AQed costs.
    Strangely enough, fixing this increases SSIM on some sources but decreases it on others.
    More investigation needed.
    Account for weighted bipred.
    Reduce memory, increase precision, simplify, and early terminate.

 common/frame.c | 2 +-
 common/frame.h | 2 +-
 encoder/slicetype.c | 65 ++++++++++++++++++++++----------------------------
 3 files changed, 31 insertions(+), 38 deletions(-)

commit 3b047a2a7d55d613c6ae49da22c7f30d02a048dc
Author: Jason Garrett-Glaser
Date:   Sat Aug 8 17:51:01 2009 -0700

    Add missing free()s for new data allocated for MB-tree

    Eliminates a memory leak.

 common/frame.c | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

commit 6f4054f79d3ff1034d634aabb9f9d9c8eb261bad
Author: Jason Garrett-Glaser
Date:   Sat Aug 8 12:53:06 2009 -0700

    Fix keyframe insertion with MB-tree and no B-frames

 encoder/slicetype.c | 20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

commit 21a38355ccd46d1dd928341f8b9f6dd69afd9cff
Author: Jason Garrett-Glaser
Date:   Sat Aug 8 11:26:36 2009 -0700

    Fix MP4 output (bug in malloc checking patch)

 muxers.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit 7139b088129707b0daf3fd3bae9647f649deb1ba
Author: Steven Walters
Date:   Fri Aug 7 16:18:01 2009 -0700

    Gracefully terminate in the case of a malloc failure

    Fuzz tests show that all mallocs appear to be checked correctly now.
 common/common.c | 64 +++++++++++-----------------
 common/common.h | 13 +++---
 common/frame.c | 20 ++++----
 common/osdep.h | 18 +++++---
 common/set.c | 11 +++--
 common/visualize.c | 7 ++-
 common/visualize.h | 2 +-
 encoder/analyse.c | 22 ++++++---
 encoder/analyse.h | 3 +-
 encoder/encoder.c | 114 +++++++++++++++++++++++++++++--------------------
 encoder/ratecontrol.c | 69 ++++++++++++++++++------------
 encoder/set.c | 12 ++++-
 encoder/set.h | 2 +-
 encoder/slicetype.c | 12 ++++-
 matroska.c | 5 +-
 muxers.c | 28 ++++++++++--
 tools/checkasm.c | 5 ++
 x264.c | 28 +++++++++---
 x264.h | 7 ++-
 19 files changed, 271 insertions(+), 171 deletions(-)

commit 71506ae7a307545b530663ecab00b06fc424ca48
Author: Anton Mitrofanov
Date:   Fri Aug 7 10:44:13 2009 -0700

    Fix a potential infinite loop in QPfile parsing on Windows

    ftell doesn't seem to work properly on Windows in text mode.

 x264.c | 4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

commit cb0b71fb981a3c6020d628dc7a41855b81df8d54
Author: Jason Garrett-Glaser
Date:   Fri Aug 7 10:31:16 2009 -0700

    Fix delay calculation with multiple threads

    Delay frames for threading don't actually count as part of lookahead.

 encoder/encoder.c | 6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

commit a1ed468f67476fbbe49e1fbfe1a567be0c052d44
Author: Jason Garrett-Glaser
Date:   Thu Aug 6 23:09:46 2009 -0700

    Add "veryslow" preset

    Apparently some people are actually *using* placebo, so I've added this
    preset to bridge the gap.

 x264.c | 17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

commit bb66c482242a0747823661b212114c1a2f015fe3
Author: Jason Garrett-Glaser
Date:   Tue Aug 4 17:46:33 2009 -0700

    Macroblock-tree ratecontrol

    On by default; can be turned off with --no-mbtree.
    Uses a large lookahead to track temporal propagation of data and weight quality accordingly.
    Requires a very large separate statsfile (2 bytes per macroblock) in multi-pass mode.
    Doesn't work with b-pyramid yet.
    Note that MB-tree inherently measures quality different from the standard
    qcomp method, so bitrates produced by CRF may change somewhat.
    This makes the "medium" preset a bit slower. Accordingly, make "fast" slower
    as well, and introduce a new preset "faster" between "fast" and "veryfast".
    All presets "fast" and above will have MB-tree on.
    Add a new option, --rc-lookahead, to control the distance MB tree looks ahead
    to perform propagation analysis.
    Default is 40; larger values will be slower and require more memory but give
    more accurate results.
    This value will be used in the future to control ratecontrol lookahead (VBV).
    Add a new option, --no-psy, to disable all psy optimizations that don't
    improve PSNR or SSIM.
    This disables psy-RD/trellis, but also other more subtle internal psy
    optimizations that can't be controlled directly via external parameters.
    Quality improvement from MB-tree is about 2-70% depending on content.
    Strength of MB-tree adjustments can be tweaked using qcompress; higher values
    mean lower MB-tree strength.
    Note that MB-tree may perform slightly suboptimally on fades; this will be
    fixed by weighted prediction, which is coming soon.

 common/common.c | 22 ++-
 common/common.h | 50 ++++++-
 common/frame.c | 10 +-
 common/frame.h | 3 +
 common/osdep.h | 9 +-
 encoder/analyse.c | 4 +-
 encoder/encoder.c | 56 ++++++-
 encoder/ratecontrol.c | 203 +++++++++++++++++-------
 encoder/ratecontrol.h | 3 +-
 encoder/slicetype.c | 424 ++++++++++++++++++++++++++++++++++++++-----------
 x264.c | 34 +++-
 x264.h | 5 +-
 12 files changed, 642 insertions(+), 181 deletions(-)

commit f21e71a04ba65aff9b5a4bfa8a73fd86c463f4ee
Author: Jason Garrett-Glaser
Date:   Mon Aug 3 20:52:30 2009 -0700

    Various 1-pass VBV tweaks

    Make predictors have an offset in addition to a multiplier.
    This primarily fixes issues in sources with lots of extremely static scenes,
    such as anime and CGI.
    We tried linear regressions, but they were very unreliable as predictors.
    Also allow VBV to be slightly more aggressive in raising QPs to avoid not
    having enough bits left in some situations.
    Up to 1db improvement on some clips.

 encoder/ratecontrol.c | 32 +++++++++++++++++++++-----------
 1 files changed, 21 insertions(+), 11 deletions(-)

commit 5d75a9bd5b942392c4ab64156a266eed64c0793f
Author: Jason Garrett-Glaser
Date:   Tue Jul 28 20:41:27 2009 -0700

    Fix another 10L in QPRD

    An entry in subpel_iterations was missing.
    I have no idea how QPRD was working at all without this change.

 encoder/me.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

commit 97ed27054005a85e7c49209c20fd0b280917ac02
Author: Jason Garrett-Glaser
Date:   Tue Jul 28 01:16:23 2009 -0700

    Update help and cleanup in ratecontrol.c

    Deal with some out-of-date information.

 encoder/ratecontrol.c | 10 +---------
 x264.c | 4 ++--
 2 files changed, 3 insertions(+), 11 deletions(-)

commit 0538c56a95cdc41a42a206079276a57d5d76b5a5
Author: Loren Merritt
Date:   Tue Jul 28 07:16:31 2009 +0000

    15% faster refine_bidir_satd, 10% faster refine_bidir_rd (or less with trellis=2)
    re-roll a loop (saves 44KB code size, which is the cause of most of this speed gain)
    don't re-mc mvs that haven't changed

 encoder/me.c | 106 ++++++++++++++++++++++++++++++----------------------------
 1 files changed, 55 insertions(+), 51 deletions(-)

commit 306c3ee4b1c3cae804185597305725d2484f21b9
Author: Jason Garrett-Glaser
Date:   Mon Jul 27 21:03:00 2009 -0700

    Faster bidir_rd plus some bugfixes

    Cache chroma MC during refine_bidir_rd and use both the luma and chroma
    caches to skip MC in macroblock_encode.
    Fix incorrect call to rd_cost_part; refine_bidir_rd output was incorrect for i8>0.
    Remove some redundant clips.
    ~12% faster refine_bidir_rd.
 encoder/analyse.c | 38 +++++++++++++++++++-------------------
 encoder/macroblock.c | 3 ++-
 encoder/me.c | 47 +++++++++++++++++++++++++++++++++++++----------
 3 files changed, 58 insertions(+), 30 deletions(-)

commit d6eed014d0af8f87045d6d5daf3376c486efdea7
Author: Jason Garrett-Glaser
Date:   Mon Jul 27 04:45:03 2009 -0700

    Add "fastdecode" tune option

    It does what it says it does.

 x264.c | 11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

commit 43773d27a6dd74c62b6d29d0ae0a80397469bfbf
Author: Jason Garrett-Glaser
Date:   Sun Jul 26 12:20:09 2009 -0700

    Fix two bugs in QPRD

    fprofile settings now actually fprofile QPRD.
    Don't use i_mbrd before initializing it.

 Makefile | 2 +-
 encoder/analyse.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

commit 9d0b5e95bbf3f9077806927264add762952f77ad
Author: Jason Garrett-Glaser
Date:   Sun Jul 26 03:03:12 2009 -0700

    Fix 10l in QPRD

    Trellis used wrong lambda with trellis=1

 encoder/analyse.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit 4074956df13e058421fb5ba89b872be143742ffd
Author: Jason Garrett-Glaser
Date:   Sat Jul 25 22:31:06 2009 -0700

    Fix a nondeterminism with threads and subme>7

    Also add a few more checks to eliminate the need for spel_border.

 encoder/analyse.c | 5 ++---
 encoder/me.c | 34 +++++++++++++---------------------
 2 files changed, 15 insertions(+), 24 deletions(-)

commit 7733721e410acb96fdf740ca95d2a394b2a2b713
Author: Jason Garrett-Glaser
Date:   Thu Jul 23 12:20:39 2009 -0700

    Add QPRD support as subme=10

    Refactor trellis lambda selection to be done in analyse_init instead of in trellis.
    This will allow for more easy adaption of lambda later on; for now it allows
    constant lambda across variable QPs.
    QPRD is only available with adaptive quantization enabled and generally
    improves SSIM and visual quality.
    Additionally, weight the SSD values from RD based on the relative QP offset
    for chroma; helps visually at high QPs where chroma has a lower QP than luma.
    This fixes some visual artifacts created by QPRD at high QPs.
    Note that this generally hurts PSNR and SSIM, and so is only on when psy-RD is on.

 Makefile | 2 +-
 common/common.h | 5 ++
 encoder/analyse.c | 129 +++++++++++++++++++++++++++++++++++++++++++++++--
 encoder/encoder.c | 4 +-
 encoder/macroblock.c | 12 ++--
 encoder/macroblock.h | 4 +-
 encoder/ratecontrol.c | 4 --
 encoder/rdo.c | 51 +++++--------------
 x264.c | 7 ++-
 9 files changed, 160 insertions(+), 58 deletions(-)

commit f5e6980b3eb34ed610f5fc36a4378a0ed4277753
Author: Jason Garrett-Glaser
Date:   Tue Jul 21 19:56:21 2009 -0700

    SSSE3 cachesplit workaround for avg2_w16

    Palignr-based solution for the most commonly used qpel function.
    1-1.5% faster overall on Core 2 chips.

 common/x86/mc-a.asm | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++
 common/x86/mc-c.c | 8 ++++++
 2 files changed, 68 insertions(+), 0 deletions(-)

commit 29569051505a78db9dbbc8fda53ab11e7e08b994
Author: Loren Merritt
Date:   Wed Jul 22 20:20:52 2009 +0000

    shut up valgrind warnings in trellis

 encoder/rdo.c | 4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

commit 88b35c2d3bd86b42059e27db365752da9f2cd032
Author: Anton Mitrofanov
Date:   Sat Jul 18 16:30:18 2009 -0700

    New AQ algorithm option

    "Auto-variance" uses log(var)^2 instead of log(var) and attempts to adapt
    strength per-frame.
    Generates significantly better SSIM; on by default with --tune ssim.
    Whether it generates visually better quality is still up for debate.
    Available as --aq-mode 2.

 encoder/encoder.c | 2 +-
 encoder/me.c | 3 +--
 encoder/ratecontrol.c | 46 ++++++++++++++++++++++++++++++++++++++--------
 x264.c | 6 ++++--
 x264.h | 1 +
 5 files changed, 45 insertions(+), 13 deletions(-)