Modify

Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#5890 closed patch (fixed)

libmythsoundtouch: SIMD optimisation patch

Reported by: foobum@… Owned by: janne
Priority: minor Milestone: 0.22
Component: mythtv Version: head
Severity: medium Keywords: soundtouch SIMD
Cc: mark@…, kormoc Ticket locked: no

Description

The MMX optimised routines in libmythsoundtouch appear to be disabled since ALLOW_MMX is never defined. Also, the MMX routines were explicitly disabled for x86_64 since they give garbled output on that arch. This patch:

  • Defines ALLOW_MMX in soundtouch if (arch == x86 && defined(MMX))
  • Adds integer SSE2 routines for x86_64

This results in a huge speed increase - enough to make 1.2x possible with 1080i mpeg2 and 6ch AC-3 audio on a 2.2Ghz amd64 with cycles to spare.

Tested on x86_64 only. Patch applies against fixes and trunk.

Attachments (5)

soundtouch.patch (12.5 KB) - added by foobum@… 9 years ago.
soundtouch-update.patch (11.3 KB) - added by foobum@… 9 years ago.
soundtouch-update-1.patch (13.1 KB) - added by foobum@… 9 years ago.
soundtouch-update-2.patch (14.5 KB) - added by foobum@… 9 years ago.
st-fix.patch (989 bytes) - added by foobum@… 9 years ago.

Download all attachments as: .zip

Change History (26)

Changed 9 years ago by foobum@…

comment:1 follow-up: Changed 9 years ago by jppoet@…

This patch does greatly improve timestretch performance under 64bit. Thank you for that!

I did notice, however, that it causes DD5.1 to fall back to DD2.0.

comment:2 in reply to: ↑ 1 ; follow-up: Changed 9 years ago by foobum@…

Replying to jppoet@gmail.com:

I did notice, however, that it causes DD5.1 to fall back to DD2.0.

The patch doesn't cause this, timestretching 6 ch AC-3 has always given you 2 (reencoded) channels.

However, this patch will be followed by another that addresses this issue amongst a bunch of other things. Just tidying it up.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 9 years ago by jppoet@…

Replying to foobum@gmail.com:

Replying to jppoet@gmail.com:

I did notice, however, that it causes DD5.1 to fall back to DD2.0.

The patch doesn't cause this, timestretching 6 ch AC-3 has always given you 2 (reencoded) channels.

Actually, Mark Spieth added DD5.1 support 9 months ago in ticket #1104

However, this patch will be followed by another that addresses this issue amongst a bunch of other things. Just tidying it up.

Cool Thanks for your efforts.

comment:4 in reply to: ↑ 3 Changed 9 years ago by foobum@…

Replying to jppoet@gmail.com:

Actually, Mark Spieth added DD5.1 support 9 months ago in ticket #1104

Yep, but it downmixed DD5.1 to DD2.0 once timestretch was on. This is fixed in the next patch. Glad it's been useful to you.

comment:5 Changed 9 years ago by kormoc

  • Cc kormoc added

comment:6 follow-up: Changed 9 years ago by janne

  • Milestone changed from unknown to 0.22
  • Owner changed from ijr to janne
  • Status changed from new to assigned

how is the SSE asm related to sse_optimized.cpp in upstream soundtouch?

comment:7 in reply to: ↑ 6 Changed 9 years ago by foobum@…

Replying to janne:

how is the SSE asm related to sse_optimized.cpp in upstream soundtouch?

It's not.

comment:8 Changed 9 years ago by foobum@…

Think I missed the gist of your question. sse_optimized.cpp implements filters and calcCrossCorrStereo for float samples. The asm in the patch implements overlap and calcCrossCorr for stereo and multichannel and 16 bit int samples. The existing MMX filter code works fine so we use that for both x86 and x86_64.

comment:9 follow-up: Changed 9 years ago by janne

yes, I've seen that after looking at the code. Do you care enough to submit your asm to upstream soundtouch?

ALLOW_MMX is defined in STTypes.h, no need to define it in the project file. I also don't like the ARCH_X86_64 ifdefs. I've changed the preference of mmx and sse code in TDStretch.cpp

comment:10 Changed 9 years ago by janne

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [19024]) Add SSE optimized routines for soundtouch. Fixes #5890

Speedup for X86_64 since MMX SIMD routines unfortunately don't work there.

patch by: <foobum gmail com>

comment:11 in reply to: ↑ 9 Changed 9 years ago by foobum@…

Replying to janne:

yes, I've seen that after looking at the code. Do you care enough to submit your asm to upstream soundtouch?

Perhaps, although I haven't looked at whether their MMX suffers the same problem.

ALLOW_MMX is defined in STTypes.h, no need to define it in the project file. I also don't like the ARCH_X86_64 ifdefs. I've changed the preference of mmx and sse code in TDStretch.cpp

Ah.. Looks fine. Ta.

Changed 9 years ago by foobum@…

comment:12 Changed 9 years ago by foobum@…

An incremental patch that improves the asm, makes it x86 compatible and turns it on for x86 (where it should be noticeably faster than the mmx, unfortunately I can't test).

comment:13 Changed 9 years ago by janne

  • Resolution fixed deleted
  • Status changed from closed to new

comment:14 follow-up: Changed 9 years ago by janne

  • Status changed from new to assigned

Which SSE version is do you use that?

iirc soundtouch has no runtime CPU detection. you just can't assume that a MMX capable cpu has also SSE?

comment:15 in reply to: ↑ 14 Changed 9 years ago by foobum@…

Replying to janne:

SSE2 and no, I guess we shouldn't assume that.. I'll add runtime cpu detection and update patch tomorrow.

Changed 9 years ago by foobum@…

comment:16 Changed 9 years ago by foobum@…

Sorry, soundtouch does have runtime cpu detection. I have, however, renamed the define and the class SSE -> SSE2 and changed the rt cpu check to check explicitly for SSE2.

Changed 9 years ago by foobum@…

comment:17 Changed 9 years ago by foobum@…

Latest (-2) makes the asm more 'correct' - the SSE2 now produces output exactly equal to that of the C (unlike the MMX). Now works properly again when compiled with -O3 too (oops).

comment:18 Changed 9 years ago by janne

  • Status changed from assigned to accepted

comment:19 Changed 9 years ago by janne

  • Resolution set to fixed
  • Status changed from accepted to closed

(In [19283]) soundtouch SSE update. Fixes #5890

bit exact to C code, works for X86_32 patch by: foobum gmail com

Changed 9 years ago by foobum@…

comment:20 Changed 9 years ago by foobum@…

Sorry! Last diff, honest.. Missed a couple of lines in OverlapMulti(). This makes OverlapMulti()'s output bit exact too.

comment:21 Changed 9 years ago by janne

(In [19291]) make another soundtouch SSE function bit exact. Refs #5890

patch by: foobum gmail com

Add Comment

Modify Ticket

Action
as closed The owner will remain janne.
The resolution will be deleted. Next status will be 'new'.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.