Commit Graph

516 Commits

masta
aa21845cd9 Small optimization for OP_AND on ARM
Especially with 64-bit operations the CG sometimes generates:
and r0, r1, #0
which just clears r0 and is equivalent to:
mov r0, #0

git-svn-id: trunk@22032 -
2012-08-08 06:44:20 +00:00
florian
7513291ad8 * generate different code for OS_S8 -> OS_16 conversion which might fold better; idea by Nico Erfurth
git-svn-id: trunk@22027 -
2012-08-07 19:36:46 +00:00
masta
6529307d9e Don't emit useless AND/BICs in ARM CG
In certain cases the CG would emit something like
bic r1, r0, #0
As BIC clears the specified bits, this is equivalent to
mov r1, r0
This patch changes the CG to emit the MOV instead, which the register
allocator will hopefully remove most of the time.

git-svn-id: trunk@22024 -
2012-08-07 06:46:45 +00:00
masta
9e039936bf Support more operators in FoldShiftProcess on ARM
Now we can also fold shifts into teq, tst, cmp, cmn instructions.
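
An illustrative fold (registers hypothetical; only valid when r1 is not
used afterwards):

mov r1, r0, lsr #16
cmp r2, r1

becomes

cmp r2, r0, lsr #16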

git-svn-id: trunk@22023 -
2012-08-07 06:46:32 +00:00
florian
f619a1aaf6 * fld/fst can have a base register+offset
git-svn-id: trunk@22016 -
2012-08-05 18:34:13 +00:00
florian
e81ba0f82e + make use of the armv6+ sign/zero extension instructions if appropriate
git-svn-id: trunk@22013 -
2012-08-05 14:04:11 +00:00
florian
eb1efdff8a + introduce cstylearrayofconst because pocall_mwcall was forgotten at several places
git-svn-id: trunk@22012 -
2012-08-05 08:48:23 +00:00
florian
19ed835f2b * don't generate an extra indirection when loading vfp constants
git-svn-id: trunk@22010 -
2012-08-04 17:01:57 +00:00
masta
8a684c1f10 Don't generate IT instruction in second_cmp64bit for Thumb-2
Currently the register spiller cannot handle the "bond" between IT* and
a following instruction, sometimes breaking them apart, which breaks the
build or, worse, the result.

So for now we're not emitting A_IT* in second_cmp64bit anymore but use a
conditional jump instead.
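
A sketch of the difference (registers and label purely illustrative):

cmp r1, r3      @ compare the high words
bne .Lskip      @ conditional jump instead of an IT block
cmp r0, r2      @ high words equal: compare the low words
.Lskip: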

This fixes Mantis #22520

git-svn-id: trunk@22009 -
2012-08-04 16:55:58 +00:00
masta
1c51b8d906 Disable 64bit shifts for thumb2 - Fix for Mantis #22520
In r21686 I introduced optimized 64bit shifts for ARM, but the
methods did not check which machine they had to generate code for.

This patch disables the optimized code for now if the target is in
cpu_thumb2 and falls back to the generic code.

There are 2 problems with the current code:

1.) Thumb-2 does not support shift by register on all data instructions
as ARM does.
2.) The code does not generate the required IT-block for the
conditionally executed code.

git-svn-id: trunk@21997 -
2012-08-02 00:56:21 +00:00
masta
c16871e129 Generate better code in Tthumb2cgarm.g_flags2reg
The old code generated a strange IT-sequence:

IT EQ
MOVEQ r0, #1
IT NE
MOVNE r0, #1

Now we generate:

ITE EQ
MOVEQ r0, #1
MOVNE r0, #1

IT stands for IfThen, ITE for IfThenElse; there are a couple of other
forms where the instruction is extended to handle more of the following
instructions. So we have ITEE, ITETE etc.; up to 4 instructions can be
handled.
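
For instance, a four-instruction block (purely illustrative):

ITTEE EQ
MOVEQ r0, #1
ADDEQ r1, r1, #1
MOVNE r0, #0
SUBNE r1, r1, #1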

git-svn-id: trunk@21996 -
2012-08-02 00:56:15 +00:00
florian
023d632f44 * also optimize lsr/asr, lsl, lsr/asr sequences on arm
git-svn-id: trunk@21981 -
2012-07-28 22:30:11 +00:00
florian
283afbcb07 * new controllers by lelekx, resolves #22523
git-svn-id: trunk@21980 -
2012-07-28 21:57:29 +00:00
florian
c5ad1bce7b * avoid unnecessary zero extensions in case code
git-svn-id: trunk@21979 -
2012-07-28 20:09:21 +00:00
florian
c8435b503f * better folding of consecutive shift operations
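An example of the kind of sequence this targets (illustrative):

mov r0, r0, lsl #2
mov r0, r0, lsl #3

which can be folded into

mov r0, r0, lsl #5
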
git-svn-id: trunk@21978 -
2012-07-28 17:59:45 +00:00
florian
614afc1c8f * pass march to GNU AS for cpu_armv6 and cpu_armv7
git-svn-id: trunk@21958 -
2012-07-23 20:20:17 +00:00
Jonas Maebe
0a1157da38 * fixed memory leaks in the compiler introduced in r21862 by marking and
releasing temporarily created function result locations

git-svn-id: trunk@21953 -
2012-07-23 13:49:29 +00:00
florian
d5aa89449e * generate less register-wasting code for 64 bit comparisons
git-svn-id: trunk@21950 -
2012-07-22 21:07:33 +00:00
masta
be6bf6e3f7 Fix possible access violation introduced in r21885
r21885 added a new peephole optimizer. The associated code refactoring
missed a check for

  tai(hp1).typ = tai_instruction

which can lead to an access violation later on, because the rest of the
code expects to find a taicpu in hp1.

git-svn-id: trunk@21949 -
2012-07-22 18:06:08 +00:00
Jonas Maebe
3798b79fd7 + optimization that (re)orders instance fields of Delphi-style classes in
order to minimise memory losses due to alignment padding. Not yet enabled
    by default at any optimization level, but can be (de)activated separately
    via -Oo(no)orderfields
   o added separate tdef.structalignment method that returns the alignment
     of a type when it appears in a record/object/class (factors out
     AIX-specific double alignment in structs)
   o changed the handling of the offset of a delegate interface
     implemented via a field, by taking the field offset on demand
     rather than at declaration time (because the ordering optimization
     causes the offsets of fields to be unknown until the entire
     declaration has been parsed)
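
As an illustration (sizes approximate, ignoring the hidden VMT pointer):
a class declaring a byte, an int64 and another byte in that order needs
7 bytes of padding before the int64, while placing the int64 first
shrinks the instance data from 24 to 16 bytes.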

git-svn-id: trunk@21947 -
2012-07-22 16:47:19 +00:00
masta
aa4fe66153 Fix ARM ASM-reader for MVN/CMP/CMN/TST/TEQ
Like MOV these instructions support 2 operands, with the second being a
shifterop.

Without this patch the asm reader would fail on something like

cmp r0, r1, lsr 16

with

Error: Unknown identifier "LSR"

git-svn-id: trunk@21911 -
2012-07-15 01:03:08 +00:00
masta
e2a744e19b Consolidate do_spill_read/do_spill_written on arm
ARM cannot reference an arbitrary offset, so it needs some special
handling if the offset goes beyond abs(4095).

The code for do_spill_read and do_spill_written used to be very similar.
I've partially factored out the code into spilling_create_load_store.

The former code loaded the offset from a constant pool, which is a waste
of memory bandwidth and cache lines. The new code tries to find a way to
adjust the base register so the memory location can be reached more
easily; this allows us to handle at least +-1MB with just a single
additional ADD or SUB instruction. If that fails we'll resort to the
normal constant-loading code, which on its own will fall back to loading
the constant from a constant pool.

So instead of:
ldr r1, =16388
ldr r0, [r13, r1]

which uses at least 4 cycles (2 instruction cycles + 2 stall
cycles) on most cores.

We try to generate:
add r1, r13, #16384
ldr r0, [r1, #4]

which most armv5+ cores will execute in 2 cycles. We'll also save on
DCache usage.

git-svn-id: trunk@21889 -
2012-07-12 01:11:23 +00:00
florian
701a5d76bb * remove unneeded movs
git-svn-id: trunk@21885 -
2012-07-11 20:58:52 +00:00
masta
57b67dfa30 Better SP adjustments on entry/exit for ARM
If the needed adjustment is not expressible in a shifter constant, the old
code loaded a temporary register (fixed to r12) via a_load_const_reg and
used it to adjust the SP, resulting in:

mov r12, #44
orr r12, r12, #4096
sub sp, sp, r12

The new code will try to split the adjustment into 2 shifter constants and
will do two separate adjustments:

sub sp, sp, #44
sub sp, sp, #4096

If that doesn't work we'll fall back to the old code. But that should
happen VERY rarely, only for stacks bigger than 256k which are not
expressible in 2 shifter constants.

git-svn-id: trunk@21863 -
2012-07-11 08:41:45 +00:00
florian
95732625cc * use r11 as a normal register if no frame pointer is needed
git-svn-id: trunk@21834 -
2012-07-09 17:17:23 +00:00
masta
dbf0404fb0 More consolidation of OP_SHL/SHR/ROR/SAR in ARM CodeGen
This removes the duplication in a_op_reg_reg_reg_checkoverflow.
OP_ROL stays separate because it again needs some special treatment.

The code for OP_ROL was changed; previously it generated:
mov tempreg, #32
sub src1, tempreg, src1
mov dst, src2, ror src1

This would trash src1, which MIGHT be a problem, but I'm not totally
sure. Either way, the mov/sub pair was replaced with rsb, so the new
code looks like this:

rsb tempreg, src1, #32
mov dst, src2, ror tempreg

If src1 gets freed afterwards, the register allocator should be able to
change that into:

rsb src1, src1, #32
mov dst, src2, ror src1

git-svn-id: trunk@21804 -
2012-07-06 15:01:31 +00:00
masta
d2d5d17557 Consolidate handling of OP_SHL/SHR/ROL/ROR/SAR in ARM CodeGen
The previous code was full of duplicated code; this new version just
maps the OP_* to the correct SM_* and does some special handling for
OP_ROL, which is done via OP_ROR.
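
Presumably the mapping is OP_SHL -> SM_LSL, OP_SHR -> SM_LSR,
OP_SAR -> SM_ASR and OP_ROR -> SM_ROR, while OP_ROL by a constant n is
rewritten as OP_ROR by (32 - n).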

git-svn-id: trunk@21801 -
2012-07-06 12:10:42 +00:00
masta
504a0ce0ca Fix for Mantis #22326
This fixes 64bit shifts on arm with a constant shift value of 0.

The old code would have emitted something like this:
mov r0, r0, lsl #32
As 32 is an invalid shift value (and would be wrong anyway), the
assembler declined to assemble the produced source.

The new code will just not emit any code for a shift value of 0.

tests/test/tint642.pp now tests shl/shr 0 on 64 bit values.
tests/webtbs/tw22326.pp is also added as an additional test.

git-svn-id: trunk@21746 -
2012-07-01 08:09:00 +00:00
Jonas Maebe
7a0ae38700 + also specify the parameter def when allocating a parameter via
getintparaloc + adapted all call sites of getintparaloc. This
    led to a number of additional, related changes:
   o corrected the type information for some getintparaloc parameters
   o don't allocate some intparalocs in cases they aren't used
   o changed "const tvardata" parameter into "constref tvardata" for
     fpc_variant_copy_overwrite to make pass-by-reference semantics
     explicit
   o moved a number of routines that now have to call find_system_type()
     from cgobj to hlcgobj so that cgobj doesn't have to start depending
     on the symtable unit
   o added versions of the cpureg alloc/dealloc methods to hlcgobj that
     call through to their cgobj counterparts, so we can save/restore
     the cpu registers before/after calling system helpers from hlcgobj
     (not implemented in hlcgobj itself, because all basic register
      allocator functionality is still part of cgobj/cgcpu)

git-svn-id: trunk@21696 -
2012-06-24 15:02:12 +00:00
Jonas Maebe
c3ea451aea * set tcgpara.vardef when creating parameter info
git-svn-id: trunk@21693 -
2012-06-24 15:01:54 +00:00
Jonas Maebe
2d48396587 - removed redundant checks
git-svn-id: trunk@21692 -
2012-06-24 15:01:48 +00:00
Jonas Maebe
587244c088 * factored out common code from get_funcretloc()
* set tcgpara.def for the function return location (field introduced for and
    already used by the JVM code generator, required for future hlcg
    functionality)

git-svn-id: trunk@21691 -
2012-06-24 15:01:42 +00:00
masta
ca70207bc0 Support 64-bit shifts on ARM.
This code generates different versions of assembly depending on the
amount to shift.

Variable Amount: 6 cycles (5 if the last shift can be folded)
Constant 1     : 2 cycles
Constant 2-31  : 3 cycles (2 if the last shift can be folded)
Constant 32    : 1 cycle  (depends on the register allocator)
Constant 33-64 : 2 cycles

This should speed up softfpu on arm a bit.
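
For the constant-32 shl case, for instance, something like the following
suffices (hi/lo register names illustrative):

mov rhi, rlo
mov rlo, #0

where the register allocator can often turn the first MOV into a plain
renaming, leaving a single instruction.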

git-svn-id: trunk@21686 -
2012-06-23 20:36:27 +00:00
masta
3566956389 Fix ARM-Assembler output for RRX-Shifterops
RRX (Rotate Right with eXtend) does a single-bit right rotation through
the carry flag, so it does not take any argument, neither constant nor
register.
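
For example (illustrative):

mov r0, r1, rrx    @ old carry moves into bit 31, r1 is shifted right by 1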

Also remove the redundant shiftmode2str and replace its usage with
gas_shiftmode2str.

git-svn-id: trunk@21685 -
2012-06-23 20:36:16 +00:00
masta
59c726c829 Support ABS intrinsic on ARM
This code will generate the following sequence on arm:
r1=dst
r0=src

movs r1, r0
rsbmi r1, r0, #0

movs sets the N flag when the MSB of r0 is set; if it is set, rsb
calculates dst := 0 - src.

git-svn-id: trunk@21678 -
2012-06-21 20:12:36 +00:00
masta
aeb15ba2b6 Fixed postfix check in taicpu.is_same_reg_move
The old version did not check the S postfix for MOV, which resulted in
removing instructions like:

movs r0, r0

which breaks later flag usage.

git-svn-id: trunk@21676 -
2012-06-21 20:12:25 +00:00
masta
2768e0fc12 Folded Add/Sub/Or Splitter, lots of debug output
git-svn-id: trunk@21660 -
2012-06-20 12:39:28 +00:00
masta
5498456269 Add LsrAndLsr Peephole Optimizer for ARM
Remove the superfluous AND in:
mov r0, r0, lsr #24
and r0, r0, #255

Doing this allows for better shift-folding later.

git-svn-id: trunk@21659 -
2012-06-20 12:39:19 +00:00
masta
92c47148cc Optimize 8/16 OP_NOT on ARM
This now generates:

mvn r0, r0, lsl #24/#16
mov r0, r0, lsr/asr #24/#16

The lsr/asr might be folded into a following instruction, making the
whole operation 1 cycle instead of 2-3 with the previous solution.

git-svn-id: trunk@21658 -
2012-06-20 12:39:09 +00:00
masta
0f3441a9c2 Split OP_ADD, OP_SUB, OP_AND and OP_ORR into multiple instructions if that can avoid constant construction or even loading from a pool.
OP_ADD, OP_SUB and OP_ORR will be split into two instructions if
possible when a load/const construction is required.

OP_AND is a bit different, because we can't just split it up, but we try
to find a two-instruction BIC equivalent to it.

Until now code like

a:= a and $FFFF;

produced code like

mov r0, $FF00
orr r0, r0, $FF
and r1, r1, r0

With this addition we produce code like:

bic r1, r1, $FF000000
bic r1, r1, $FF0000

Saving us at least a cycle and in some cases also a load from the
constant-pool.

This uses the new split_into_shifter_const function.

git-svn-id: trunk@21647 -
2012-06-18 16:59:29 +00:00
masta
f11fbe527e Improve loading of ARM constant values
*  use split_into_shifter_const to reduce the MOV/ORR combination to a
   single check and allow a broader range of combinations.
*  Introduce MVN/BIC combinations to load values which have more 1-bits
   set than 0-bits (like small negative values).
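
For instance (illustrative):

mvn r0, $0F             @ loads $FFFFFFF0 with a single instruction

mvn r0, $10000          @ loads $FFFEFFFF ...
bic r0, r0, $F000       @ ... then clears bits 12-15, giving $FFFE0FFF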

git-svn-id: trunk@21646 -
2012-06-18 16:59:24 +00:00
masta
d987cee96a Introduce split_into_shifter_const to ARM-Code Generator
This function tries to split up a 32-bit value into two shifter
constants. This approach finds a broader range of two-shifter-constant
combinations.
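
For example (illustrative): $00FF00FF is not a valid shifter constant by
itself, but it splits into $00FF0000 and $000000FF, each an 8-bit value
rotated right by an even amount.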

git-svn-id: trunk@21645 -
2012-06-18 16:59:19 +00:00
masta
3205169ab9 Use roldword intrinsic instead of function rotl.
These days we don't need the hand-coded rol anymore.

git-svn-id: trunk@21644 -
2012-06-18 16:59:13 +00:00
Jonas Maebe
0fc422f244 * moved definition of maxcpuregister and tcpuregisterset from cgbase to
cgutils, and defined them so they are no larger than what is required by
    the current target platform
  * added cgutils to the uses clause of several units that use the
    tcpuregisterset type

git-svn-id: trunk@21624 -
2012-06-15 18:24:35 +00:00
Jonas Maebe
708a2532fc * consistently define empty saved_mm_registers arrays as containing a single
RS_INVALID superregister (instead of sometimes RS_NO and sometimes
    RS_INVALID)
  * check for RS_INVALID in tcg.g_save_registers() and ignore such entries

git-svn-id: trunk@21622 -
2012-06-15 18:24:25 +00:00
florian
64ac48c815 * patch by Nico Erfurth: Better support for PLD on ARM
git-svn-id: trunk@21572 -
2012-06-09 17:28:05 +00:00
florian
3db61ae52d * patch by Nico Erfurth: Reworked regLoadedWithNewValue
Added better support for A_STR, A_LDR, A_STM, A_LDM.

Reworked the code to use a case statement for better readability.

git-svn-id: trunk@21571 -
2012-06-09 17:27:30 +00:00
florian
03a30ff036 * patch by Nico Erfurth: Remove STRH and STRB from instructionLoadsFromReg
STRH and STRB are not handled as separate instructions by the code
generator.

git-svn-id: trunk@21570 -
2012-06-09 17:26:06 +00:00
florian
7599de416d * patch by Nico Erfurth: Reworked MatchOperand in ARM Peephole Optimizers
Added a top_ref comparator which uses RefsEqual.
Reworked the code for easier readability by using a case statement.

git-svn-id: trunk@21569 -
2012-06-09 17:25:32 +00:00
florian
6e8594a9af * patch by Nico Erfurth: Minor fix for FoldShiftProcess peephole optimizer on ARM
Use UpdateUsedRegs and drop the check for reloading of the register, as
this is done in RegUsedAfterInstruction now.

git-svn-id: trunk@21520 -
2012-06-07 18:21:46 +00:00